Netdev List
 help / color / mirror / Atom feed
* [PATCH net-next 00/11] Time Counter fixes and improvements
From: Richard Cochran @ 2014-12-21 18:46 UTC (permalink / raw)
  To: netdev
  Cc: linux-kernel, Amir Vadai, Ariel Elior, Carolyn Wyborny,
	David Miller, Frank Li, Jeff Kirsher, John Stultz, Matthew Vick,
	Miroslav Lichvar, Mugunthan V N, Or Gerlitz, Thomas Gleixner,
	Tom Lendacky

Several PTP Hardware Clock (PHC) drivers implement the clock in
software using the timecounter/cyclecounter code. This series adds one
simple improvement and one more subtle fix to the shared timecounter
facility. Credit for this series goes to Janusz Użycki, who pointed
the issues out to me off list.

Patch #1 simply move the timecounter code into its own file. When
working on this series, it was really annoying to see half the kernel
recompile after every tweak to the timecounter stuff. There is no
reason to keep this together with the clocksource code.

Patch #2 implements an improved adjtime() method, and patches 3-10
convert all of the drivers over to the new method.

Patch #11 fixes a subtle but important issue with the timecounter WRT
frequency adjustment. As it stands now, a timecounter based PHC will
exhibit a variable frequency resolution (and variable time error)
depending on how often the clock is read.

In timecounter_read_delta(), the expression

   (delta * cc->mult) >> cc->shift;

can lose resolution from the adjusted value of 'mult'. If the value
of 'delta' is too small, then small changes in 'mult' have no effect.
However, if the delta value is large enough, then small changes in
'mult' will have an effect.

Reading the clock too often means smaller 'delta' values which in turn
will spoil the fine adjustments made to 'mult'. Up until now, this
effect did not show up in my testing. The following example explains
why.

The CPTS has an input clock of 250 MHz, and the clock source uses
mult=0x80000000 and shift=29, making the ticks to nanoseconds
conversion like this:

   ticks * 2^31
   ------------
       2^29

Imagine what happens if the clock is read every 10 milliseconds. Ten
milliseconds are about 2500000 ticks, which corresponds to about 21
bits. The product in the numerator has then 52 bits. After the shift
operation, 23 bits are preserved. This results in a frequency
adjustment resolution of about 0.1 ppm (not _too_ bad.)

A frequency resolution of 1 ppm requires 20 bits.
A frequency resolution of 1 ppb requires 30 bits.

For the 250 MHz CPTS clock, reading every 4 seconds yields a 1 ppb
resolution (which is the finest that our API allows).

However, the error can be much higher if the clock is read too often
or if time stamps occur close in time to read operations. In general
it is really not acceptable to allow the rate of clock readings to
influence the clock accuracy.

Thanks,
Richard

Richard Cochran (11):
  time: move the timecounter/cyclecounter code into its own file.
  timecounter: provide a helper function to shift the time.
  net: xgbe: convert to timecounter adjtime.
  net: bnx2x: convert to timecounter adjtime.
  net: fec: convert to timecounter adjtime.
  net: e1000e: convert to timecounter adjtime.
  net: igb: convert to timecounter adjtime.
  net: ixgbe: convert to timecounter adjtime.
  net: mlx4: convert to timecounter adjtime.
  net: cpts: convert to timecounter adjtime.
  timecounter: keep track of accumulated fractional nanoseconds

 drivers/net/ethernet/amd/xgbe/xgbe-ptp.c         |    8 +-
 drivers/net/ethernet/amd/xgbe/xgbe.h             |    2 +-
 drivers/net/ethernet/broadcom/bnx2x/bnx2x.h      |    2 +-
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c |    6 +-
 drivers/net/ethernet/freescale/fec.h             |    1 +
 drivers/net/ethernet/freescale/fec_ptp.c         |   16 +--
 drivers/net/ethernet/intel/e1000e/e1000.h        |    2 +-
 drivers/net/ethernet/intel/e1000e/ptp.c          |    5 +-
 drivers/net/ethernet/intel/igb/igb.h             |    2 +-
 drivers/net/ethernet/intel/igb/igb_ptp.c         |    7 +-
 drivers/net/ethernet/intel/ixgbe/ixgbe.h         |    2 +-
 drivers/net/ethernet/intel/ixgbe/ixgbe_ptp.c     |   11 +-
 drivers/net/ethernet/mellanox/mlx4/en_clock.c    |    9 +-
 drivers/net/ethernet/ti/cpts.c                   |    5 +-
 drivers/net/ethernet/ti/cpts.h                   |    1 +
 include/clocksource/arm_arch_timer.h             |    2 +-
 include/linux/clocksource.h                      |  102 ----------------
 include/linux/mlx4/device.h                      |    2 +-
 include/linux/timecounter.h                      |  136 ++++++++++++++++++++++
 include/linux/types.h                            |    3 +
 kernel/time/Makefile                             |    2 +-
 kernel/time/clocksource.c                        |   76 ------------
 kernel/time/timecounter.c                        |  112 ++++++++++++++++++
 sound/pci/hda/hda_priv.h                         |    2 +-
 virt/kvm/arm/arch_timer.c                        |    3 +-
 25 files changed, 274 insertions(+), 245 deletions(-)
 create mode 100644 include/linux/timecounter.h
 create mode 100644 kernel/time/timecounter.c

-- 
1.7.10.4

^ permalink raw reply

* [PATCH net-next 02/11] timecounter: provide a helper function to shift the time.
From: Richard Cochran @ 2014-12-21 18:46 UTC (permalink / raw)
  To: netdev
  Cc: linux-kernel, Amir Vadai, Ariel Elior, Carolyn Wyborny,
	David Miller, Frank Li, Jeff Kirsher, John Stultz, Matthew Vick,
	Miroslav Lichvar, Mugunthan V N, Or Gerlitz, Thomas Gleixner,
	Tom Lendacky
In-Reply-To: <cover.1418504883.git.richardcochran@gmail.com>

Some PTP Hardware Clock drivers use a struct timecounter to represent
their clock. To adjust the time by a given offset, these drivers all
perform a two step read/write of their timecounter. However, it is
better and simpler just to adjust the offset in one step. This patch
introduces a little routine to help drivers implement the adjtime
method.

Suggested-by: Janusz Użycki <j.uzycki@elproma.com.pl>
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
---
 include/linux/timecounter.h |    9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/include/linux/timecounter.h b/include/linux/timecounter.h
index 146f07a..af3dfa4 100644
--- a/include/linux/timecounter.h
+++ b/include/linux/timecounter.h
@@ -79,6 +79,15 @@ static inline u64 cyclecounter_cyc2ns(const struct cyclecounter *cc,
 }
 
 /**
+ * timecounter_adjtime - Shifts the time of the clock.
+ * @delta:	Desired change in nanoseconds.
+ */
+static inline void timecounter_adjtime(struct timecounter *tc, s64 delta)
+{
+	tc->nsec += delta;
+}
+
+/**
  * timecounter_init - initialize a time counter
  * @tc:			Pointer to time counter which is to be initialized/reset
  * @cc:			A cycle counter, ready to be used.
-- 
1.7.10.4

^ permalink raw reply related

* [PATCH net-next 04/11] net: bnx2x: convert to timecounter adjtime.
From: Richard Cochran @ 2014-12-21 18:46 UTC (permalink / raw)
  To: netdev
  Cc: linux-kernel, Amir Vadai, Ariel Elior, Carolyn Wyborny,
	David Miller, Frank Li, Jeff Kirsher, John Stultz, Matthew Vick,
	Miroslav Lichvar, Mugunthan V N, Or Gerlitz, Thomas Gleixner,
	Tom Lendacky
In-Reply-To: <cover.1418504883.git.richardcochran@gmail.com>

This patch changes the driver to use the new and improved method
for adjusting the offset of a timecounter.

Compile tested only.

Signed-off-by: Richard Cochran <richardcochran@gmail.com>
---
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c |    6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
index 691f0bf..7b2cc40 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
@@ -13265,14 +13265,10 @@ static int bnx2x_ptp_adjfreq(struct ptp_clock_info *ptp, s32 ppb)
 static int bnx2x_ptp_adjtime(struct ptp_clock_info *ptp, s64 delta)
 {
 	struct bnx2x *bp = container_of(ptp, struct bnx2x, ptp_clock_info);
-	u64 now;
 
 	DP(BNX2X_MSG_PTP, "PTP adjtime called, delta = %llx\n", delta);
 
-	now = timecounter_read(&bp->timecounter);
-	now += delta;
-	/* Re-init the timecounter */
-	timecounter_init(&bp->timecounter, &bp->cyclecounter, now);
+	timecounter_adjtime(&bp->timecounter, delta);
 
 	return 0;
 }
-- 
1.7.10.4

^ permalink raw reply related

* [PATCH net-next 05/11] net: fec: convert to timecounter adjtime.
From: Richard Cochran @ 2014-12-21 18:47 UTC (permalink / raw)
  To: netdev
  Cc: linux-kernel, Amir Vadai, Ariel Elior, Carolyn Wyborny,
	David Miller, Frank Li, Jeff Kirsher, John Stultz, Matthew Vick,
	Miroslav Lichvar, Mugunthan V N, Or Gerlitz, Thomas Gleixner,
	Tom Lendacky
In-Reply-To: <cover.1418504883.git.richardcochran@gmail.com>

This patch changes the driver to use the new and improved method
for adjusting the offset of a timecounter.

Compile tested only.

Signed-off-by: Richard Cochran <richardcochran@gmail.com>
---
 drivers/net/ethernet/freescale/fec_ptp.c |   16 +---------------
 1 file changed, 1 insertion(+), 15 deletions(-)

diff --git a/drivers/net/ethernet/freescale/fec_ptp.c b/drivers/net/ethernet/freescale/fec_ptp.c
index 992c8c3..1f9cf23 100644
--- a/drivers/net/ethernet/freescale/fec_ptp.c
+++ b/drivers/net/ethernet/freescale/fec_ptp.c
@@ -374,23 +374,9 @@ static int fec_ptp_adjtime(struct ptp_clock_info *ptp, s64 delta)
 	struct fec_enet_private *fep =
 	    container_of(ptp, struct fec_enet_private, ptp_caps);
 	unsigned long flags;
-	u64 now;
-	u32 counter;
 
 	spin_lock_irqsave(&fep->tmreg_lock, flags);
-
-	now = timecounter_read(&fep->tc);
-	now += delta;
-
-	/* Get the timer value based on adjusted timestamp.
-	 * Update the counter with the masked value.
-	 */
-	counter = now & fep->cc.mask;
-	writel(counter, fep->hwp + FEC_ATIME);
-
-	/* reset the timecounter */
-	timecounter_init(&fep->tc, &fep->cc, now);
-
+	timecounter_adjtime(&fep->tc, delta);
 	spin_unlock_irqrestore(&fep->tmreg_lock, flags);
 
 	return 0;
-- 
1.7.10.4

^ permalink raw reply related

* [PATCH net-next 06/11] net: e1000e: convert to timecounter adjtime.
From: Richard Cochran @ 2014-12-21 18:47 UTC (permalink / raw)
  To: netdev
  Cc: linux-kernel, Amir Vadai, Ariel Elior, Carolyn Wyborny,
	David Miller, Frank Li, Jeff Kirsher, John Stultz, Matthew Vick,
	Miroslav Lichvar, Mugunthan V N, Or Gerlitz, Thomas Gleixner,
	Tom Lendacky
In-Reply-To: <cover.1418504883.git.richardcochran@gmail.com>

This patch changes the driver to use the new and improved method
for adjusting the offset of a timecounter.

Signed-off-by: Richard Cochran <richardcochran@gmail.com>
---
 drivers/net/ethernet/intel/e1000e/ptp.c |    5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/intel/e1000e/ptp.c b/drivers/net/ethernet/intel/e1000e/ptp.c
index fb1a914..978ef9c 100644
--- a/drivers/net/ethernet/intel/e1000e/ptp.c
+++ b/drivers/net/ethernet/intel/e1000e/ptp.c
@@ -90,12 +90,9 @@ static int e1000e_phc_adjtime(struct ptp_clock_info *ptp, s64 delta)
 	struct e1000_adapter *adapter = container_of(ptp, struct e1000_adapter,
 						     ptp_clock_info);
 	unsigned long flags;
-	s64 now;
 
 	spin_lock_irqsave(&adapter->systim_lock, flags);
-	now = timecounter_read(&adapter->tc);
-	now += delta;
-	timecounter_init(&adapter->tc, &adapter->cc, now);
+	timecounter_adjtime(&adapter->tc, delta);
 	spin_unlock_irqrestore(&adapter->systim_lock, flags);
 
 	return 0;
-- 
1.7.10.4

^ permalink raw reply related

* [PATCH net-next 07/11] net: igb: convert to timecounter adjtime.
From: Richard Cochran @ 2014-12-21 18:47 UTC (permalink / raw)
  To: netdev
  Cc: linux-kernel, Amir Vadai, Ariel Elior, Carolyn Wyborny,
	David Miller, Frank Li, Jeff Kirsher, John Stultz, Matthew Vick,
	Miroslav Lichvar, Mugunthan V N, Or Gerlitz, Thomas Gleixner,
	Tom Lendacky
In-Reply-To: <cover.1418504883.git.richardcochran@gmail.com>

This patch changes the driver to use the new and improved method
for adjusting the offset of a timecounter.

Signed-off-by: Richard Cochran <richardcochran@gmail.com>
---
 drivers/net/ethernet/intel/igb/igb_ptp.c |    7 +------
 1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/intel/igb/igb_ptp.c b/drivers/net/ethernet/intel/igb/igb_ptp.c
index 794c139..1d27f2d 100644
--- a/drivers/net/ethernet/intel/igb/igb_ptp.c
+++ b/drivers/net/ethernet/intel/igb/igb_ptp.c
@@ -256,14 +256,9 @@ static int igb_ptp_adjtime_82576(struct ptp_clock_info *ptp, s64 delta)
 	struct igb_adapter *igb = container_of(ptp, struct igb_adapter,
 					       ptp_caps);
 	unsigned long flags;
-	s64 now;
 
 	spin_lock_irqsave(&igb->tmreg_lock, flags);
-
-	now = timecounter_read(&igb->tc);
-	now += delta;
-	timecounter_init(&igb->tc, &igb->cc, now);
-
+	timecounter_adjtime(&igb->tc, delta);
 	spin_unlock_irqrestore(&igb->tmreg_lock, flags);
 
 	return 0;
-- 
1.7.10.4

^ permalink raw reply related

* [PATCH net-next 08/11] net: ixgbe: convert to timecounter adjtime.
From: Richard Cochran @ 2014-12-21 18:47 UTC (permalink / raw)
  To: netdev
  Cc: linux-kernel, Amir Vadai, Ariel Elior, Carolyn Wyborny,
	David Miller, Frank Li, Jeff Kirsher, John Stultz, Matthew Vick,
	Miroslav Lichvar, Mugunthan V N, Or Gerlitz, Thomas Gleixner,
	Tom Lendacky
In-Reply-To: <cover.1418504883.git.richardcochran@gmail.com>

This patch changes the driver to use the new and improved method
for adjusting the offset of a timecounter.

Compile tested only.

Signed-off-by: Richard Cochran <richardcochran@gmail.com>
---
 drivers/net/ethernet/intel/ixgbe/ixgbe_ptp.c |   11 +----------
 1 file changed, 1 insertion(+), 10 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ptp.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ptp.c
index 5fd4b52..47c29ea 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ptp.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ptp.c
@@ -261,18 +261,9 @@ static int ixgbe_ptp_adjtime(struct ptp_clock_info *ptp, s64 delta)
 	struct ixgbe_adapter *adapter =
 		container_of(ptp, struct ixgbe_adapter, ptp_caps);
 	unsigned long flags;
-	u64 now;
 
 	spin_lock_irqsave(&adapter->tmreg_lock, flags);
-
-	now = timecounter_read(&adapter->tc);
-	now += delta;
-
-	/* reset the timecounter */
-	timecounter_init(&adapter->tc,
-			 &adapter->cc,
-			 now);
-
+	timecounter_adjtime(&adapter->tc, delta);
 	spin_unlock_irqrestore(&adapter->tmreg_lock, flags);
 
 	ixgbe_ptp_setup_sdp(adapter);
-- 
1.7.10.4

^ permalink raw reply related

* [PATCH net-next 09/11] net: mlx4: convert to timecounter adjtime.
From: Richard Cochran @ 2014-12-21 18:47 UTC (permalink / raw)
  To: netdev
  Cc: linux-kernel, Amir Vadai, Ariel Elior, Carolyn Wyborny,
	David Miller, Frank Li, Jeff Kirsher, John Stultz, Matthew Vick,
	Miroslav Lichvar, Mugunthan V N, Or Gerlitz, Thomas Gleixner,
	Tom Lendacky
In-Reply-To: <cover.1418504883.git.richardcochran@gmail.com>

This patch changes the driver to use the new and improved method
for adjusting the offset of a timecounter.

Compile tested only.

Signed-off-by: Richard Cochran <richardcochran@gmail.com>
---
 drivers/net/ethernet/mellanox/mlx4/en_clock.c |    5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/en_clock.c b/drivers/net/ethernet/mellanox/mlx4/en_clock.c
index 9990144..df35d0e 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_clock.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_clock.c
@@ -147,12 +147,9 @@ static int mlx4_en_phc_adjtime(struct ptp_clock_info *ptp, s64 delta)
 	struct mlx4_en_dev *mdev = container_of(ptp, struct mlx4_en_dev,
 						ptp_clock_info);
 	unsigned long flags;
-	s64 now;
 
 	write_lock_irqsave(&mdev->clock_lock, flags);
-	now = timecounter_read(&mdev->clock);
-	now += delta;
-	timecounter_init(&mdev->clock, &mdev->cycles, now);
+	timecounter_adjtime(&mdev->clock, delta);
 	write_unlock_irqrestore(&mdev->clock_lock, flags);
 
 	return 0;
-- 
1.7.10.4

^ permalink raw reply related

* [PATCH net-next 10/11] net: cpts: convert to timecounter adjtime.
From: Richard Cochran @ 2014-12-21 18:47 UTC (permalink / raw)
  To: netdev
  Cc: linux-kernel, Amir Vadai, Ariel Elior, Carolyn Wyborny,
	David Miller, Frank Li, Jeff Kirsher, John Stultz, Matthew Vick,
	Miroslav Lichvar, Mugunthan V N, Or Gerlitz, Thomas Gleixner,
	Tom Lendacky
In-Reply-To: <cover.1418504883.git.richardcochran@gmail.com>

This patch changes the driver to use the new and improved method
for adjusting the offset of a timecounter.

Signed-off-by: Richard Cochran <richardcochran@gmail.com>
---
 drivers/net/ethernet/ti/cpts.c |    5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/ti/cpts.c b/drivers/net/ethernet/ti/cpts.c
index 4a4388b..fbe42cb 100644
--- a/drivers/net/ethernet/ti/cpts.c
+++ b/drivers/net/ethernet/ti/cpts.c
@@ -157,14 +157,11 @@ static int cpts_ptp_adjfreq(struct ptp_clock_info *ptp, s32 ppb)
 
 static int cpts_ptp_adjtime(struct ptp_clock_info *ptp, s64 delta)
 {
-	s64 now;
 	unsigned long flags;
 	struct cpts *cpts = container_of(ptp, struct cpts, info);
 
 	spin_lock_irqsave(&cpts->lock, flags);
-	now = timecounter_read(&cpts->tc);
-	now += delta;
-	timecounter_init(&cpts->tc, &cpts->cc, now);
+	timecounter_adjtime(&cpts->tc, delta);
 	spin_unlock_irqrestore(&cpts->lock, flags);
 
 	return 0;
-- 
1.7.10.4

^ permalink raw reply related

* [PATCH net-next 01/11] time: move the timecounter/cyclecounter code into its own file.
From: Richard Cochran @ 2014-12-21 18:46 UTC (permalink / raw)
  To: netdev
  Cc: linux-kernel, Amir Vadai, Ariel Elior, Carolyn Wyborny,
	David Miller, Frank Li, Jeff Kirsher, John Stultz, Matthew Vick,
	Miroslav Lichvar, Mugunthan V N, Or Gerlitz, Thomas Gleixner,
	Tom Lendacky
In-Reply-To: <cover.1418504883.git.richardcochran@gmail.com>

The timecounter code has almost nothing to do with the clocksource
code. Let it live in its own file. This will help isolate the
timecounter users from the clocksource users in the source tree.

Signed-off-by: Richard Cochran <richardcochran@gmail.com>
---
 drivers/net/ethernet/amd/xgbe/xgbe.h        |    2 +-
 drivers/net/ethernet/broadcom/bnx2x/bnx2x.h |    2 +-
 drivers/net/ethernet/freescale/fec.h        |    1 +
 drivers/net/ethernet/intel/e1000e/e1000.h   |    2 +-
 drivers/net/ethernet/intel/igb/igb.h        |    2 +-
 drivers/net/ethernet/intel/ixgbe/ixgbe.h    |    2 +-
 drivers/net/ethernet/ti/cpts.h              |    1 +
 include/clocksource/arm_arch_timer.h        |    2 +-
 include/linux/clocksource.h                 |  102 ----------------------
 include/linux/mlx4/device.h                 |    2 +-
 include/linux/timecounter.h                 |  122 +++++++++++++++++++++++++++
 include/linux/types.h                       |    3 +
 kernel/time/Makefile                        |    2 +-
 kernel/time/clocksource.c                   |   76 -----------------
 kernel/time/timecounter.c                   |   95 +++++++++++++++++++++
 sound/pci/hda/hda_priv.h                    |    2 +-
 16 files changed, 231 insertions(+), 187 deletions(-)
 create mode 100644 include/linux/timecounter.h
 create mode 100644 kernel/time/timecounter.c

diff --git a/drivers/net/ethernet/amd/xgbe/xgbe.h b/drivers/net/ethernet/amd/xgbe/xgbe.h
index f9ec762..2af6aff 100644
--- a/drivers/net/ethernet/amd/xgbe/xgbe.h
+++ b/drivers/net/ethernet/amd/xgbe/xgbe.h
@@ -124,7 +124,7 @@
 #include <linux/if_vlan.h>
 #include <linux/bitops.h>
 #include <linux/ptp_clock_kernel.h>
-#include <linux/clocksource.h>
+#include <linux/timecounter.h>
 #include <linux/net_tstamp.h>
 #include <net/dcbnl.h>
 
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h b/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h
index c3a6072..792ba72 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h
@@ -22,7 +22,7 @@
 
 #include <linux/ptp_clock_kernel.h>
 #include <linux/net_tstamp.h>
-#include <linux/clocksource.h>
+#include <linux/timecounter.h>
 
 /* compilation time flags */
 
diff --git a/drivers/net/ethernet/freescale/fec.h b/drivers/net/ethernet/freescale/fec.h
index 469691a..df8bbdd 100644
--- a/drivers/net/ethernet/freescale/fec.h
+++ b/drivers/net/ethernet/freescale/fec.h
@@ -16,6 +16,7 @@
 #include <linux/clocksource.h>
 #include <linux/net_tstamp.h>
 #include <linux/ptp_clock_kernel.h>
+#include <linux/timecounter.h>
 
 #if defined(CONFIG_M523x) || defined(CONFIG_M527x) || defined(CONFIG_M528x) || \
     defined(CONFIG_M520x) || defined(CONFIG_M532x) || \
diff --git a/drivers/net/ethernet/intel/e1000e/e1000.h b/drivers/net/ethernet/intel/e1000e/e1000.h
index 7785240..9416e5a 100644
--- a/drivers/net/ethernet/intel/e1000e/e1000.h
+++ b/drivers/net/ethernet/intel/e1000e/e1000.h
@@ -34,7 +34,7 @@
 #include <linux/pci-aspm.h>
 #include <linux/crc32.h>
 #include <linux/if_vlan.h>
-#include <linux/clocksource.h>
+#include <linux/timecounter.h>
 #include <linux/net_tstamp.h>
 #include <linux/ptp_clock_kernel.h>
 #include <linux/ptp_classify.h>
diff --git a/drivers/net/ethernet/intel/igb/igb.h b/drivers/net/ethernet/intel/igb/igb.h
index 82d891e..ee22da3 100644
--- a/drivers/net/ethernet/intel/igb/igb.h
+++ b/drivers/net/ethernet/intel/igb/igb.h
@@ -29,7 +29,7 @@
 #include "e1000_mac.h"
 #include "e1000_82575.h"
 
-#include <linux/clocksource.h>
+#include <linux/timecounter.h>
 #include <linux/net_tstamp.h>
 #include <linux/ptp_clock_kernel.h>
 #include <linux/bitops.h>
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe.h b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
index b6137be..38fc64c 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe.h
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
@@ -38,7 +38,7 @@
 #include <linux/if_vlan.h>
 #include <linux/jiffies.h>
 
-#include <linux/clocksource.h>
+#include <linux/timecounter.h>
 #include <linux/net_tstamp.h>
 #include <linux/ptp_clock_kernel.h>
 
diff --git a/drivers/net/ethernet/ti/cpts.h b/drivers/net/ethernet/ti/cpts.h
index 1a581ef..69a46b9 100644
--- a/drivers/net/ethernet/ti/cpts.h
+++ b/drivers/net/ethernet/ti/cpts.h
@@ -27,6 +27,7 @@
 #include <linux/list.h>
 #include <linux/ptp_clock_kernel.h>
 #include <linux/skbuff.h>
+#include <linux/timecounter.h>
 
 struct cpsw_cpts {
 	u32 idver;                /* Identification and version */
diff --git a/include/clocksource/arm_arch_timer.h b/include/clocksource/arm_arch_timer.h
index 6d26b40..9916d0e 100644
--- a/include/clocksource/arm_arch_timer.h
+++ b/include/clocksource/arm_arch_timer.h
@@ -16,7 +16,7 @@
 #ifndef __CLKSOURCE_ARM_ARCH_TIMER_H
 #define __CLKSOURCE_ARM_ARCH_TIMER_H
 
-#include <linux/clocksource.h>
+#include <linux/timecounter.h>
 #include <linux/types.h>
 
 #define ARCH_TIMER_CTRL_ENABLE		(1 << 0)
diff --git a/include/linux/clocksource.h b/include/linux/clocksource.h
index abcafaa..9c78d15 100644
--- a/include/linux/clocksource.h
+++ b/include/linux/clocksource.h
@@ -18,8 +18,6 @@
 #include <asm/div64.h>
 #include <asm/io.h>
 
-/* clocksource cycle base type */
-typedef u64 cycle_t;
 struct clocksource;
 struct module;
 
@@ -28,106 +26,6 @@ struct module;
 #endif
 
 /**
- * struct cyclecounter - hardware abstraction for a free running counter
- *	Provides completely state-free accessors to the underlying hardware.
- *	Depending on which hardware it reads, the cycle counter may wrap
- *	around quickly. Locking rules (if necessary) have to be defined
- *	by the implementor and user of specific instances of this API.
- *
- * @read:		returns the current cycle value
- * @mask:		bitmask for two's complement
- *			subtraction of non 64 bit counters,
- *			see CLOCKSOURCE_MASK() helper macro
- * @mult:		cycle to nanosecond multiplier
- * @shift:		cycle to nanosecond divisor (power of two)
- */
-struct cyclecounter {
-	cycle_t (*read)(const struct cyclecounter *cc);
-	cycle_t mask;
-	u32 mult;
-	u32 shift;
-};
-
-/**
- * struct timecounter - layer above a %struct cyclecounter which counts nanoseconds
- *	Contains the state needed by timecounter_read() to detect
- *	cycle counter wrap around. Initialize with
- *	timecounter_init(). Also used to convert cycle counts into the
- *	corresponding nanosecond counts with timecounter_cyc2time(). Users
- *	of this code are responsible for initializing the underlying
- *	cycle counter hardware, locking issues and reading the time
- *	more often than the cycle counter wraps around. The nanosecond
- *	counter will only wrap around after ~585 years.
- *
- * @cc:			the cycle counter used by this instance
- * @cycle_last:		most recent cycle counter value seen by
- *			timecounter_read()
- * @nsec:		continuously increasing count
- */
-struct timecounter {
-	const struct cyclecounter *cc;
-	cycle_t cycle_last;
-	u64 nsec;
-};
-
-/**
- * cyclecounter_cyc2ns - converts cycle counter cycles to nanoseconds
- * @cc:		Pointer to cycle counter.
- * @cycles:	Cycles
- *
- * XXX - This could use some mult_lxl_ll() asm optimization. Same code
- * as in cyc2ns, but with unsigned result.
- */
-static inline u64 cyclecounter_cyc2ns(const struct cyclecounter *cc,
-				      cycle_t cycles)
-{
-	u64 ret = (u64)cycles;
-	ret = (ret * cc->mult) >> cc->shift;
-	return ret;
-}
-
-/**
- * timecounter_init - initialize a time counter
- * @tc:			Pointer to time counter which is to be initialized/reset
- * @cc:			A cycle counter, ready to be used.
- * @start_tstamp:	Arbitrary initial time stamp.
- *
- * After this call the current cycle register (roughly) corresponds to
- * the initial time stamp. Every call to timecounter_read() increments
- * the time stamp counter by the number of elapsed nanoseconds.
- */
-extern void timecounter_init(struct timecounter *tc,
-			     const struct cyclecounter *cc,
-			     u64 start_tstamp);
-
-/**
- * timecounter_read - return nanoseconds elapsed since timecounter_init()
- *                    plus the initial time stamp
- * @tc:          Pointer to time counter.
- *
- * In other words, keeps track of time since the same epoch as
- * the function which generated the initial time stamp.
- */
-extern u64 timecounter_read(struct timecounter *tc);
-
-/**
- * timecounter_cyc2time - convert a cycle counter to same
- *                        time base as values returned by
- *                        timecounter_read()
- * @tc:		Pointer to time counter.
- * @cycle_tstamp:	a value returned by tc->cc->read()
- *
- * Cycle counts that are converted correctly as long as they
- * fall into the interval [-1/2 max cycle count, +1/2 max cycle count],
- * with "max cycle count" == cs->mask+1.
- *
- * This allows conversion of cycle counter values which were generated
- * in the past.
- */
-extern u64 timecounter_cyc2time(struct timecounter *tc,
-				cycle_t cycle_tstamp);
-
-/**
  * struct clocksource - hardware abstraction for a free running counter
  *	Provides mostly state-free accessors to the underlying hardware.
  *	This is the structure used for system time.
diff --git a/include/linux/mlx4/device.h b/include/linux/mlx4/device.h
index 25c791e..f1e41b3 100644
--- a/include/linux/mlx4/device.h
+++ b/include/linux/mlx4/device.h
@@ -42,7 +42,7 @@
 
 #include <linux/atomic.h>
 
-#include <linux/clocksource.h>
+#include <linux/timecounter.h>
 
 #define MAX_MSIX_P_PORT		17
 #define MAX_MSIX		64
diff --git a/include/linux/timecounter.h b/include/linux/timecounter.h
new file mode 100644
index 0000000..146f07a
--- /dev/null
+++ b/include/linux/timecounter.h
@@ -0,0 +1,122 @@
+/*
+ * linux/include/linux/timecounter.h
+ *
+ * based on code that migrated away from
+ * linux/include/linux/clocksource.h
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+#ifndef _LINUX_TIMECOUNTER_H
+#define _LINUX_TIMECOUNTER_H
+
+#include <linux/types.h>
+
+/**
+ * struct cyclecounter - hardware abstraction for a free running counter
+ *	Provides completely state-free accessors to the underlying hardware.
+ *	Depending on which hardware it reads, the cycle counter may wrap
+ *	around quickly. Locking rules (if necessary) have to be defined
+ *	by the implementor and user of specific instances of this API.
+ *
+ * @read:		returns the current cycle value
+ * @mask:		bitmask for two's complement
+ *			subtraction of non 64 bit counters,
+ *			see CLOCKSOURCE_MASK() helper macro
+ * @mult:		cycle to nanosecond multiplier
+ * @shift:		cycle to nanosecond divisor (power of two)
+ */
+struct cyclecounter {
+	cycle_t (*read)(const struct cyclecounter *cc);
+	cycle_t mask;
+	u32 mult;
+	u32 shift;
+};
+
+/**
+ * struct timecounter - layer above a %struct cyclecounter which counts nanoseconds
+ *	Contains the state needed by timecounter_read() to detect
+ *	cycle counter wrap around. Initialize with
+ *	timecounter_init(). Also used to convert cycle counts into the
+ *	corresponding nanosecond counts with timecounter_cyc2time(). Users
+ *	of this code are responsible for initializing the underlying
+ *	cycle counter hardware, locking issues and reading the time
+ *	more often than the cycle counter wraps around. The nanosecond
+ *	counter will only wrap around after ~585 years.
+ *
+ * @cc:			the cycle counter used by this instance
+ * @cycle_last:		most recent cycle counter value seen by
+ *			timecounter_read()
+ * @nsec:		continuously increasing count
+ */
+struct timecounter {
+	const struct cyclecounter *cc;
+	cycle_t cycle_last;
+	u64 nsec;
+};
+
+/**
+ * cyclecounter_cyc2ns - converts cycle counter cycles to nanoseconds
+ * @cc:		Pointer to cycle counter.
+ * @cycles:	Cycles
+ *
+ * XXX - This could use some mult_lxl_ll() asm optimization. Same code
+ * as in cyc2ns, but with unsigned result.
+ */
+static inline u64 cyclecounter_cyc2ns(const struct cyclecounter *cc,
+				      cycle_t cycles)
+{
+	u64 ret = (u64)cycles;
+	ret = (ret * cc->mult) >> cc->shift;
+	return ret;
+}
+
+/**
+ * timecounter_init - initialize a time counter
+ * @tc:			Pointer to time counter which is to be initialized/reset
+ * @cc:			A cycle counter, ready to be used.
+ * @start_tstamp:	Arbitrary initial time stamp.
+ *
+ * After this call the current cycle register (roughly) corresponds to
+ * the initial time stamp. Every call to timecounter_read() increments
+ * the time stamp counter by the number of elapsed nanoseconds.
+ */
+extern void timecounter_init(struct timecounter *tc,
+			     const struct cyclecounter *cc,
+			     u64 start_tstamp);
+
+/**
+ * timecounter_read - return nanoseconds elapsed since timecounter_init()
+ *                    plus the initial time stamp
+ * @tc:          Pointer to time counter.
+ *
+ * In other words, keeps track of time since the same epoch as
+ * the function which generated the initial time stamp.
+ */
+extern u64 timecounter_read(struct timecounter *tc);
+
+/**
+ * timecounter_cyc2time - convert a cycle counter to same
+ *                        time base as values returned by
+ *                        timecounter_read()
+ * @tc:		Pointer to time counter.
+ * @cycle_tstamp:	a value returned by tc->cc->read()
+ *
+ * Cycle counts that are converted correctly as long as they
+ * fall into the interval [-1/2 max cycle count, +1/2 max cycle count],
+ * with "max cycle count" == cs->mask+1.
+ *
+ * This allows conversion of cycle counter values which were generated
+ * in the past.
+ */
+extern u64 timecounter_cyc2time(struct timecounter *tc,
+				cycle_t cycle_tstamp);
+
+#endif
diff --git a/include/linux/types.h b/include/linux/types.h
index a0bb704..6232382 100644
--- a/include/linux/types.h
+++ b/include/linux/types.h
@@ -213,5 +213,8 @@ struct callback_head {
 };
 #define rcu_head callback_head
 
+/* clocksource cycle base type */
+typedef u64 cycle_t;
+
 #endif /*  __ASSEMBLY__ */
 #endif /* _LINUX_TYPES_H */
diff --git a/kernel/time/Makefile b/kernel/time/Makefile
index f622cf2..c09c078 100644
--- a/kernel/time/Makefile
+++ b/kernel/time/Makefile
@@ -1,6 +1,6 @@
 obj-y += time.o timer.o hrtimer.o itimer.o posix-timers.o posix-cpu-timers.o
 obj-y += timekeeping.o ntp.o clocksource.o jiffies.o timer_list.o
-obj-y += timeconv.o posix-clock.o alarmtimer.o
+obj-y += timeconv.o timecounter.o posix-clock.o alarmtimer.o
 
 obj-$(CONFIG_GENERIC_CLOCKEVENTS_BUILD)		+= clockevents.o
 obj-$(CONFIG_GENERIC_CLOCKEVENTS)		+= tick-common.o
diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
index b79f39b..4892352 100644
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -34,82 +34,6 @@
 #include "tick-internal.h"
 #include "timekeeping_internal.h"
 
-void timecounter_init(struct timecounter *tc,
-		      const struct cyclecounter *cc,
-		      u64 start_tstamp)
-{
-	tc->cc = cc;
-	tc->cycle_last = cc->read(cc);
-	tc->nsec = start_tstamp;
-}
-EXPORT_SYMBOL_GPL(timecounter_init);
-
-/**
- * timecounter_read_delta - get nanoseconds since last call of this function
- * @tc:         Pointer to time counter
- *
- * When the underlying cycle counter runs over, this will be handled
- * correctly as long as it does not run over more than once between
- * calls.
- *
- * The first call to this function for a new time counter initializes
- * the time tracking and returns an undefined result.
- */
-static u64 timecounter_read_delta(struct timecounter *tc)
-{
-	cycle_t cycle_now, cycle_delta;
-	u64 ns_offset;
-
-	/* read cycle counter: */
-	cycle_now = tc->cc->read(tc->cc);
-
-	/* calculate the delta since the last timecounter_read_delta(): */
-	cycle_delta = (cycle_now - tc->cycle_last) & tc->cc->mask;
-
-	/* convert to nanoseconds: */
-	ns_offset = cyclecounter_cyc2ns(tc->cc, cycle_delta);
-
-	/* update time stamp of timecounter_read_delta() call: */
-	tc->cycle_last = cycle_now;
-
-	return ns_offset;
-}
-
-u64 timecounter_read(struct timecounter *tc)
-{
-	u64 nsec;
-
-	/* increment time by nanoseconds since last call */
-	nsec = timecounter_read_delta(tc);
-	nsec += tc->nsec;
-	tc->nsec = nsec;
-
-	return nsec;
-}
-EXPORT_SYMBOL_GPL(timecounter_read);
-
-u64 timecounter_cyc2time(struct timecounter *tc,
-			 cycle_t cycle_tstamp)
-{
-	u64 cycle_delta = (cycle_tstamp - tc->cycle_last) & tc->cc->mask;
-	u64 nsec;
-
-	/*
-	 * Instead of always treating cycle_tstamp as more recent
-	 * than tc->cycle_last, detect when it is too far in the
-	 * future and treat it as old time stamp instead.
-	 */
-	if (cycle_delta > tc->cc->mask / 2) {
-		cycle_delta = (tc->cycle_last - cycle_tstamp) & tc->cc->mask;
-		nsec = tc->nsec - cyclecounter_cyc2ns(tc->cc, cycle_delta);
-	} else {
-		nsec = cyclecounter_cyc2ns(tc->cc, cycle_delta) + tc->nsec;
-	}
-
-	return nsec;
-}
-EXPORT_SYMBOL_GPL(timecounter_cyc2time);
-
 /**
  * clocks_calc_mult_shift - calculate mult/shift factors for scaled math of clocks
  * @mult:	pointer to mult variable
diff --git a/kernel/time/timecounter.c b/kernel/time/timecounter.c
new file mode 100644
index 0000000..59a1ec3
--- /dev/null
+++ b/kernel/time/timecounter.c
@@ -0,0 +1,95 @@
+/*
+ * linux/kernel/time/timecounter.c
+ *
+ * based on code that migrated away from
+ * linux/kernel/time/clocksource.c
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include <linux/export.h>
+#include <linux/timecounter.h>
+
+void timecounter_init(struct timecounter *tc,
+		      const struct cyclecounter *cc,
+		      u64 start_tstamp)
+{
+	tc->cc = cc;
+	tc->cycle_last = cc->read(cc);
+	tc->nsec = start_tstamp;
+}
+EXPORT_SYMBOL_GPL(timecounter_init);
+
+/**
+ * timecounter_read_delta - get nanoseconds since last call of this function
+ * @tc:         Pointer to time counter
+ *
+ * When the underlying cycle counter runs over, this will be handled
+ * correctly as long as it does not run over more than once between
+ * calls.
+ *
+ * The first call to this function for a new time counter initializes
+ * the time tracking and returns an undefined result.
+ */
+static u64 timecounter_read_delta(struct timecounter *tc)
+{
+	cycle_t cycle_now, cycle_delta;
+	u64 ns_offset;
+
+	/* read cycle counter: */
+	cycle_now = tc->cc->read(tc->cc);
+
+	/* calculate the delta since the last timecounter_read_delta(): */
+	cycle_delta = (cycle_now - tc->cycle_last) & tc->cc->mask;
+
+	/* convert to nanoseconds: */
+	ns_offset = cyclecounter_cyc2ns(tc->cc, cycle_delta);
+
+	/* update time stamp of timecounter_read_delta() call: */
+	tc->cycle_last = cycle_now;
+
+	return ns_offset;
+}
+
+u64 timecounter_read(struct timecounter *tc)
+{
+	u64 nsec;
+
+	/* increment time by nanoseconds since last call */
+	nsec = timecounter_read_delta(tc);
+	nsec += tc->nsec;
+	tc->nsec = nsec;
+
+	return nsec;
+}
+EXPORT_SYMBOL_GPL(timecounter_read);
+
+u64 timecounter_cyc2time(struct timecounter *tc,
+			 cycle_t cycle_tstamp)
+{
+	u64 cycle_delta = (cycle_tstamp - tc->cycle_last) & tc->cc->mask;
+	u64 nsec;
+
+	/*
+	 * Instead of always treating cycle_tstamp as more recent
+	 * than tc->cycle_last, detect when it is too far in the
+	 * future and treat it as old time stamp instead.
+	 */
+	if (cycle_delta > tc->cc->mask / 2) {
+		cycle_delta = (tc->cycle_last - cycle_tstamp) & tc->cc->mask;
+		nsec = tc->nsec - cyclecounter_cyc2ns(tc->cc, cycle_delta);
+	} else {
+		nsec = cyclecounter_cyc2ns(tc->cc, cycle_delta) + tc->nsec;
+	}
+
+	return nsec;
+}
+EXPORT_SYMBOL_GPL(timecounter_cyc2time);
diff --git a/sound/pci/hda/hda_priv.h b/sound/pci/hda/hda_priv.h
index aa484fd..3fc7cfe 100644
--- a/sound/pci/hda/hda_priv.h
+++ b/sound/pci/hda/hda_priv.h
@@ -15,7 +15,7 @@
 #ifndef __SOUND_HDA_PRIV_H
 #define __SOUND_HDA_PRIV_H
 
-#include <linux/clocksource.h>
+#include <linux/timecounter.h>
 #include <sound/core.h>
 #include <sound/pcm.h>
 
-- 
1.7.10.4

^ permalink raw reply related

* [PATCH net-next 03/11] net: xgbe: convert to timecounter adjtime.
From: Richard Cochran @ 2014-12-21 18:46 UTC (permalink / raw)
  To: netdev
  Cc: linux-kernel, Amir Vadai, Ariel Elior, Carolyn Wyborny,
	David Miller, Frank Li, Jeff Kirsher, John Stultz, Matthew Vick,
	Miroslav Lichvar, Mugunthan V N, Or Gerlitz, Thomas Gleixner,
	Tom Lendacky
In-Reply-To: <cover.1418504883.git.richardcochran@gmail.com>

This patch changes the driver to use the new and improved method
for adjusting the offset of a timecounter.

Compile tested only.

Signed-off-by: Richard Cochran <richardcochran@gmail.com>
---
 drivers/net/ethernet/amd/xgbe/xgbe-ptp.c |    8 +-------
 1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/amd/xgbe/xgbe-ptp.c b/drivers/net/ethernet/amd/xgbe/xgbe-ptp.c
index a1bf9d1c..f5acf4c 100644
--- a/drivers/net/ethernet/amd/xgbe/xgbe-ptp.c
+++ b/drivers/net/ethernet/amd/xgbe/xgbe-ptp.c
@@ -171,15 +171,9 @@ static int xgbe_adjtime(struct ptp_clock_info *info, s64 delta)
 						   struct xgbe_prv_data,
 						   ptp_clock_info);
 	unsigned long flags;
-	u64 nsec;
 
 	spin_lock_irqsave(&pdata->tstamp_lock, flags);
-
-	nsec = timecounter_read(&pdata->tstamp_tc);
-
-	nsec += delta;
-	timecounter_init(&pdata->tstamp_tc, &pdata->tstamp_cc, nsec);
-
+	timecounter_adjtime(&pdata->tstamp_tc, delta);
 	spin_unlock_irqrestore(&pdata->tstamp_lock, flags);
 
 	return 0;
-- 
1.7.10.4

^ permalink raw reply related

* [PATCH net-next 11/11] timecounter: keep track of accumulated fractional nanoseconds
From: Richard Cochran @ 2014-12-21 18:47 UTC (permalink / raw)
  To: netdev
  Cc: linux-kernel, Amir Vadai, Ariel Elior, Carolyn Wyborny,
	David Miller, Frank Li, Jeff Kirsher, John Stultz, Matthew Vick,
	Miroslav Lichvar, Mugunthan V N, Or Gerlitz, Thomas Gleixner,
	Tom Lendacky
In-Reply-To: <cover.1418504883.git.richardcochran@gmail.com>

The current timecounter implementation will drop a variable amount
of resolution, depending on the magnitude of the time delta. In
other words, reading the clock too often or too close to a time
stamp conversion will introduce errors into the time values. This
patch fixes the issue by introducing a fractional nanosecond field
that accumulates the low order bits.

Reported-by: Janusz Użycki <j.uzycki@elproma.com.pl>
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
---
 drivers/net/ethernet/mellanox/mlx4/en_clock.c |    4 ++--
 include/linux/timecounter.h                   |   19 +++++++++------
 kernel/time/timecounter.c                     |   31 +++++++++++++++++++------
 virt/kvm/arm/arch_timer.c                     |    3 ++-
 4 files changed, 40 insertions(+), 17 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/en_clock.c b/drivers/net/ethernet/mellanox/mlx4/en_clock.c
index df35d0e..e9cce4f 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_clock.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_clock.c
@@ -240,7 +240,7 @@ void mlx4_en_init_timestamp(struct mlx4_en_dev *mdev)
 {
 	struct mlx4_dev *dev = mdev->dev;
 	unsigned long flags;
-	u64 ns;
+	u64 ns, zero = 0;
 
 	rwlock_init(&mdev->clock_lock);
 
@@ -265,7 +265,7 @@ void mlx4_en_init_timestamp(struct mlx4_en_dev *mdev)
 	/* Calculate period in seconds to call the overflow watchdog - to make
 	 * sure counter is checked at least once every wrap around.
 	 */
-	ns = cyclecounter_cyc2ns(&mdev->cycles, mdev->cycles.mask);
+	ns = cyclecounter_cyc2ns(&mdev->cycles, mdev->cycles.mask, zero, &zero);
 	do_div(ns, NSEC_PER_SEC / 2 / HZ);
 	mdev->overflow_period = ns;
 
diff --git a/include/linux/timecounter.h b/include/linux/timecounter.h
index af3dfa4..74f4549 100644
--- a/include/linux/timecounter.h
+++ b/include/linux/timecounter.h
@@ -55,27 +55,32 @@ struct cyclecounter {
  * @cycle_last:		most recent cycle counter value seen by
  *			timecounter_read()
  * @nsec:		continuously increasing count
+ * @mask:		bit mask for maintaining the 'frac' field
+ * @frac:		accumulated fractional nanoseconds
  */
 struct timecounter {
 	const struct cyclecounter *cc;
 	cycle_t cycle_last;
 	u64 nsec;
+	u64 mask;
+	u64 frac;
 };
 
 /**
  * cyclecounter_cyc2ns - converts cycle counter cycles to nanoseconds
  * @cc:		Pointer to cycle counter.
  * @cycles:	Cycles
- *
- * XXX - This could use some mult_lxl_ll() asm optimization. Same code
- * as in cyc2ns, but with unsigned result.
+ * @mask:	bit mask for maintaining the 'frac' field
+ * @frac:	pointer to storage for the fractional nanoseconds.
  */
 static inline u64 cyclecounter_cyc2ns(const struct cyclecounter *cc,
-				      cycle_t cycles)
+				      cycle_t cycles, u64 mask, u64 *frac)
 {
-	u64 ret = (u64)cycles;
-	ret = (ret * cc->mult) >> cc->shift;
-	return ret;
+	u64 ns = (u64) cycles;
+
+	ns = (ns * cc->mult) + *frac;
+	*frac = ns & mask;
+	return ns >> cc->shift;
 }
 
 /**
diff --git a/kernel/time/timecounter.c b/kernel/time/timecounter.c
index 59a1ec3..4687b31 100644
--- a/kernel/time/timecounter.c
+++ b/kernel/time/timecounter.c
@@ -25,6 +25,8 @@ void timecounter_init(struct timecounter *tc,
 	tc->cc = cc;
 	tc->cycle_last = cc->read(cc);
 	tc->nsec = start_tstamp;
+	tc->mask = (1ULL << cc->shift) - 1;
+	tc->frac = 0;
 }
 EXPORT_SYMBOL_GPL(timecounter_init);
 
@@ -51,7 +53,8 @@ static u64 timecounter_read_delta(struct timecounter *tc)
 	cycle_delta = (cycle_now - tc->cycle_last) & tc->cc->mask;
 
 	/* convert to nanoseconds: */
-	ns_offset = cyclecounter_cyc2ns(tc->cc, cycle_delta);
+	ns_offset = cyclecounter_cyc2ns(tc->cc, cycle_delta,
+					tc->mask, &tc->frac);
 
 	/* update time stamp of timecounter_read_delta() call: */
 	tc->cycle_last = cycle_now;
@@ -72,22 +75,36 @@ u64 timecounter_read(struct timecounter *tc)
 }
 EXPORT_SYMBOL_GPL(timecounter_read);
 
+/*
+ * This is like cyclecounter_cyc2ns(), but it is used for computing a
+ * time previous to the time stored in the cycle counter.
+ */
+static u64 cc_cyc2ns_backwards(const struct cyclecounter *cc,
+			       cycle_t cycles, u64 mask, u64 frac)
+{
+	u64 ns = (u64) cycles;
+
+	ns = ((ns * cc->mult) - frac) >> cc->shift;
+
+	return ns;
+}
+
 u64 timecounter_cyc2time(struct timecounter *tc,
 			 cycle_t cycle_tstamp)
 {
-	u64 cycle_delta = (cycle_tstamp - tc->cycle_last) & tc->cc->mask;
-	u64 nsec;
+	u64 delta = (cycle_tstamp - tc->cycle_last) & tc->cc->mask;
+	u64 nsec = tc->nsec, frac = tc->frac;
 
 	/*
 	 * Instead of always treating cycle_tstamp as more recent
 	 * than tc->cycle_last, detect when it is too far in the
 	 * future and treat it as old time stamp instead.
 	 */
-	if (cycle_delta > tc->cc->mask / 2) {
-		cycle_delta = (tc->cycle_last - cycle_tstamp) & tc->cc->mask;
-		nsec = tc->nsec - cyclecounter_cyc2ns(tc->cc, cycle_delta);
+	if (delta > tc->cc->mask / 2) {
+		delta = (tc->cycle_last - cycle_tstamp) & tc->cc->mask;
+		nsec -= cc_cyc2ns_backwards(tc->cc, delta, tc->mask, frac);
 	} else {
-		nsec = cyclecounter_cyc2ns(tc->cc, cycle_delta) + tc->nsec;
+		nsec += cyclecounter_cyc2ns(tc->cc, delta, tc->mask, &frac);
 	}
 
 	return nsec;
diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
index 22fa819..75d9564 100644
--- a/virt/kvm/arm/arch_timer.c
+++ b/virt/kvm/arm/arch_timer.c
@@ -150,7 +150,8 @@ void kvm_timer_sync_hwstate(struct kvm_vcpu *vcpu)
 		return;
 	}
 
-	ns = cyclecounter_cyc2ns(timecounter->cc, cval - now);
+	ns = cyclecounter_cyc2ns(timecounter->cc, cval - now, timecounter->mask,
+				 &timecounter->frac);
 	timer_arm(timer, ns);
 }
 
-- 
1.7.10.4

^ permalink raw reply related

* Re: [PATCH net-next 04/11] net: bnx2x: convert to timecounter adjtime.
From: Richard Cochran @ 2014-12-21 19:01 UTC (permalink / raw)
  To: netdev
  Cc: linux-kernel, Amir Vadai, Ariel Elior, Carolyn Wyborny,
	David Miller, Frank Li, Jeff Kirsher, John Stultz, Matthew Vick,
	Miroslav Lichvar, Mugunthan V N, Or Gerlitz, Thomas Gleixner,
	Tom Lendacky
In-Reply-To: <976abf89c833c376c2666def2c60a7de0f020026.1418504889.git.richardcochran@gmail.com>

On Sun, Dec 21, 2014 at 07:46:59PM +0100, Richard Cochran wrote:
> This patch changes the driver to use the new and improved method
> for adjusting the offset of a timecounter.

BTW, I noticed that this driver does not serialize access to the
timecounter in any way, but probably it should. Ariel?

Thanks,
Richard

^ permalink raw reply

* Re: SRIOV as bridge Re: [PATCH net-next RESEND] net: Do not call ndo_dflt_fdb_dump if ndo_fdb_dump is defined.
From: Roopa Prabhu @ 2014-12-21 19:08 UTC (permalink / raw)
  To: Jamal Hadi Salim
  Cc: John Fastabend, Hubert Sokolowski, netdev@vger.kernel.org,
	Vlad Yasevich, Shrijeet Mukherjee
In-Reply-To: <5496D8E2.1090700@mojatatu.com>

On 12/21/14, 6:27 AM, Jamal Hadi Salim wrote:
>
> Sorry for the latency, Ive been down with a bad flu (its bad when i cant
> type on my keyboard sitting infront of me;->),  recovering and the
> thread seems to have caught on - should be able to catchup in the
> next few days.
> I am beginning to reach a conclusion that the current switchdev approach
> is *not* going to work for SRIOV. I also worry it may be too late
> to change that.
> Shrijeet wanted to set up a BOF for netdev to have hopefully final 
> consensus. Shrijeet, are you going to make an official request for the 
> BOF?
>
> Sorry John, I dont have enough energy to address all your points but i
> will try to just focus on SRIOV and will save a few bytes while at it.
>
>
> On 12/16/14 11:35, John Fastabend wrote:
>
>> But in the SR-IOV case you have multiple "Cpu ports" and you want
>> to send packets to each of them depending on the configuration.
>>
>>
>>     port0   port1     port2  port3
>>      |        |        |      |      uplinks
>>   +------------------------------+
>>   |                              |
>>   |       SRIOV edge relay       |
>>   |                              |
>>   +------------------------------+
>>                   |                   downlink
>>
>>
>
> Two points above:
> 1) Did you flip uplink vs down link above?
> (I Thought URP was the wire link)
> 2) What you are not showing above which is *very important* is that
> infact there is an underlying embedded fdb.
>
> point #2 brings out a lot of the weird things in some of the bridge
> code. IOW, you have an *offloaded* bridge with _bridge ports_
> visible in the kernel but not the bridge that is controlled
> by standard Linux bridge tools. I am not saying that the model is
> wrong; on the contrary what Ben had exposed may fall under the
> same category i.e you have E_BRIDGE flag on the netdev to say it sits
> on top of an offloaded bridge and you dont need a br0 to run
> bridge command on. But then we need some proxy (TheClassThingy) to act
> as intemediary to the offloaded hardware.
> If you do that then the vf becomes simply a bridge port - which
> means bridge port ops apply.
>
> SRIOV it seems to have morphed its own toolkit.
> The PF port, when acting as the control interface, is actually
> TheClassThingy we discuss on/off.
> To add an fdb entry to point to vf 1, where TheClassThingy is eth1:
> ip link set eth1 vf 1 mac aa:bb:cc:dd:ee:ff vlan 10
>
> IMO, SRIOV should expose these ports with names and ifindices
> (probably does already) and pre-populated master or something
> which points to its parent, then i can do the following:
> bridge fdb add aa:bb:cc:dd:ee:ff vlan 10 dev vf1 master
I had a slightly different understanding of how this would work for 
SRIOV. So, am attempting to respond to your questions for John..., ...so 
that he can correct my understanding too ..if needed :).

I think SRIOV VF's do have netdevs (John can confirm, I maybe wrong). To 
me if SRIOV has a single fdb for all VF's under a PF,
and it wants to bypass the bridge driver, there is still no reason to 
refer to the PF as a master.
You can use self and go to the vf driver directly and it will do the 
right thing.

bridge fdb add aa:bb:cc:dd:ee:ff vlan 10 dev vf1 self

>
> master in such a case will go to TheClassThingy which would pass
> such control to the underlying hardware.
> The PF still stays but not as the management interface.

Even if 'TheClassThingy' where there, you wouldn't refer to it as the 
master (ie the PF will not have a netdev master/slave relationship with 
the VF). 'master' will still be used for the netdev 'upper'  device if 
VF was enslaved to one (which could be a bridge).


Thanks,
Roopa

^ permalink raw reply

* Re: SRIOV as bridge Re: [PATCH net-next RESEND] net: Do not call ndo_dflt_fdb_dump if ndo_fdb_dump is defined.
From: Jamal Hadi Salim @ 2014-12-21 19:19 UTC (permalink / raw)
  To: Roopa Prabhu
  Cc: John Fastabend, Hubert Sokolowski, netdev@vger.kernel.org,
	Vlad Yasevich, Shrijeet Mukherjee
In-Reply-To: <54971A93.6050700@cumulusnetworks.com>

On 12/21/14 14:08, Roopa Prabhu wrote:
> PF still stays but not as the management interface.
>
> Even if 'TheClassThingy' where there, you wouldn't refer to it as the
> master (ie the PF will not have a netdev master/slave relationship with
> the VF). 'master' will still be used for the netdev 'upper'  device if
> VF was enslaved to one (which could be a bridge).
>

Well, there is an embedded switch underneath the VFs (in hardware).
You cant send pkts from one VF to another without going through this
switch (or in VEPA mode via it). i.e you dont need a kernel bridge.
So in essence the VF is a bridge port to  this embedded switch (as
is the PF). So the role of master points downwards from the kernel.
Master is just not visible at the kernel. I am not sure what "self"
would mean in this case.
This is why i dont think current switchdev approach would work.

cheers,
jamal

^ permalink raw reply

* [RESEND PATCH] net: s6gmac: remove driver
From: Daniel Glöckner @ 2014-12-21 19:27 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, linux-kernel, Daniel Glöckner
In-Reply-To: <20141111.113007.194586174190893576.davem@davemloft.net>

The s6000 Xtensa support has been removed from the kernel in
4006e565e1500db4. There are no other chips using this driver.

While the Mentor/Alcatel PE-MCXMAC IP core is also used in other
designs (Freescale Gianfar/UCC, QLogic NetXen, Solarflare, Agere
ET-1310, Netlogic XLR/XLS), none of these use this driver as it
heavily depends on the s6000 DMA engine. In fact, there is no
code sharing across any of the aforementioned devices.

Signed-off-by: Daniel Glöckner <dg@emlix.com>
---
 drivers/net/ethernet/Kconfig  |   12 -
 drivers/net/ethernet/Makefile |    1 -
 drivers/net/ethernet/s6gmac.c | 1058 -----------------------------------------
 3 files changed, 1071 deletions(-)
 delete mode 100644 drivers/net/ethernet/s6gmac.c

diff --git a/drivers/net/ethernet/Kconfig b/drivers/net/ethernet/Kconfig
index df76050..eadcb05 100644
--- a/drivers/net/ethernet/Kconfig
+++ b/drivers/net/ethernet/Kconfig
@@ -156,18 +156,6 @@ source "drivers/net/ethernet/realtek/Kconfig"
 source "drivers/net/ethernet/renesas/Kconfig"
 source "drivers/net/ethernet/rdc/Kconfig"
 source "drivers/net/ethernet/rocker/Kconfig"
-
-config S6GMAC
-	tristate "S6105 GMAC ethernet support"
-	depends on XTENSA_VARIANT_S6000
-	select PHYLIB
-	---help---
-	  This driver supports the on chip ethernet device on the
-	  S6105 xtensa processor.
-
-	  To compile this driver as a module, choose M here. The module
-	  will be called s6gmac.
-
 source "drivers/net/ethernet/samsung/Kconfig"
 source "drivers/net/ethernet/seeq/Kconfig"
 source "drivers/net/ethernet/silan/Kconfig"
diff --git a/drivers/net/ethernet/Makefile b/drivers/net/ethernet/Makefile
index bf56f8b..1367afc 100644
--- a/drivers/net/ethernet/Makefile
+++ b/drivers/net/ethernet/Makefile
@@ -66,7 +66,6 @@ obj-$(CONFIG_NET_VENDOR_REALTEK) += realtek/
 obj-$(CONFIG_SH_ETH) += renesas/
 obj-$(CONFIG_NET_VENDOR_RDC) += rdc/
 obj-$(CONFIG_NET_VENDOR_ROCKER) += rocker/
-obj-$(CONFIG_S6GMAC) += s6gmac.o
 obj-$(CONFIG_NET_VENDOR_SAMSUNG) += samsung/
 obj-$(CONFIG_NET_VENDOR_SEEQ) += seeq/
 obj-$(CONFIG_NET_VENDOR_SILAN) += silan/
diff --git a/drivers/net/ethernet/s6gmac.c b/drivers/net/ethernet/s6gmac.c
deleted file mode 100644
index f537cbe..0000000
--- a/drivers/net/ethernet/s6gmac.c
+++ /dev/null
@@ -1,1058 +0,0 @@
-/*
- * Ethernet driver for S6105 on chip network device
- * (c)2008 emlix GmbH http://www.emlix.com
- * Authors:	Oskar Schirmer <oskar@scara.com>
- *		Daniel Gloeckner <dg@emlix.com>
- *
- * This program is free software; you can redistribute it and/or
- * modify it under the terms of the GNU General Public License
- * as published by the Free Software Foundation; either version
- * 2 of the License, or (at your option) any later version.
- */
-#include <linux/kernel.h>
-#include <linux/module.h>
-#include <linux/interrupt.h>
-#include <linux/types.h>
-#include <linux/delay.h>
-#include <linux/spinlock.h>
-#include <linux/netdevice.h>
-#include <linux/etherdevice.h>
-#include <linux/if.h>
-#include <linux/stddef.h>
-#include <linux/mii.h>
-#include <linux/phy.h>
-#include <linux/platform_device.h>
-#include <variant/hardware.h>
-#include <variant/dmac.h>
-
-#define DRV_NAME "s6gmac"
-#define DRV_PRMT DRV_NAME ": "
-
-
-/* register declarations */
-
-#define S6_GMAC_MACCONF1	0x000
-#define S6_GMAC_MACCONF1_TXENA		0
-#define S6_GMAC_MACCONF1_SYNCTX		1
-#define S6_GMAC_MACCONF1_RXENA		2
-#define S6_GMAC_MACCONF1_SYNCRX		3
-#define S6_GMAC_MACCONF1_TXFLOWCTRL	4
-#define S6_GMAC_MACCONF1_RXFLOWCTRL	5
-#define S6_GMAC_MACCONF1_LOOPBACK	8
-#define S6_GMAC_MACCONF1_RESTXFUNC	16
-#define S6_GMAC_MACCONF1_RESRXFUNC	17
-#define S6_GMAC_MACCONF1_RESTXMACCTRL	18
-#define S6_GMAC_MACCONF1_RESRXMACCTRL	19
-#define S6_GMAC_MACCONF1_SIMULRES	30
-#define S6_GMAC_MACCONF1_SOFTRES	31
-#define S6_GMAC_MACCONF2	0x004
-#define S6_GMAC_MACCONF2_FULL		0
-#define S6_GMAC_MACCONF2_CRCENA		1
-#define S6_GMAC_MACCONF2_PADCRCENA	2
-#define S6_GMAC_MACCONF2_LENGTHFCHK	4
-#define S6_GMAC_MACCONF2_HUGEFRAMENA	5
-#define S6_GMAC_MACCONF2_IFMODE		8
-#define S6_GMAC_MACCONF2_IFMODE_NIBBLE		1
-#define S6_GMAC_MACCONF2_IFMODE_BYTE		2
-#define S6_GMAC_MACCONF2_IFMODE_MASK		3
-#define S6_GMAC_MACCONF2_PREAMBLELEN	12
-#define S6_GMAC_MACCONF2_PREAMBLELEN_MASK	0x0F
-#define S6_GMAC_MACIPGIFG	0x008
-#define S6_GMAC_MACIPGIFG_B2BINTERPGAP	0
-#define S6_GMAC_MACIPGIFG_B2BINTERPGAP_MASK	0x7F
-#define S6_GMAC_MACIPGIFG_MINIFGENFORCE	8
-#define S6_GMAC_MACIPGIFG_B2BINTERPGAP2	16
-#define S6_GMAC_MACIPGIFG_B2BINTERPGAP1	24
-#define S6_GMAC_MACHALFDUPLEX	0x00C
-#define S6_GMAC_MACHALFDUPLEX_COLLISWIN	0
-#define S6_GMAC_MACHALFDUPLEX_COLLISWIN_MASK	0x3F
-#define S6_GMAC_MACHALFDUPLEX_RETXMAX	12
-#define S6_GMAC_MACHALFDUPLEX_RETXMAX_MASK	0x0F
-#define S6_GMAC_MACHALFDUPLEX_EXCESSDEF	16
-#define S6_GMAC_MACHALFDUPLEX_NOBACKOFF	17
-#define S6_GMAC_MACHALFDUPLEX_BPNOBCKOF	18
-#define S6_GMAC_MACHALFDUPLEX_ALTBEBENA	19
-#define S6_GMAC_MACHALFDUPLEX_ALTBEBTRN	20
-#define S6_GMAC_MACHALFDUPLEX_ALTBEBTR_MASK	0x0F
-#define S6_GMAC_MACMAXFRAMELEN	0x010
-#define S6_GMAC_MACMIICONF	0x020
-#define S6_GMAC_MACMIICONF_CSEL		0
-#define S6_GMAC_MACMIICONF_CSEL_DIV10		0
-#define S6_GMAC_MACMIICONF_CSEL_DIV12		1
-#define S6_GMAC_MACMIICONF_CSEL_DIV14		2
-#define S6_GMAC_MACMIICONF_CSEL_DIV18		3
-#define S6_GMAC_MACMIICONF_CSEL_DIV24		4
-#define S6_GMAC_MACMIICONF_CSEL_DIV34		5
-#define S6_GMAC_MACMIICONF_CSEL_DIV68		6
-#define S6_GMAC_MACMIICONF_CSEL_DIV168		7
-#define S6_GMAC_MACMIICONF_CSEL_MASK		7
-#define S6_GMAC_MACMIICONF_PREAMBLESUPR	4
-#define S6_GMAC_MACMIICONF_SCANAUTOINCR	5
-#define S6_GMAC_MACMIICMD	0x024
-#define S6_GMAC_MACMIICMD_READ		0
-#define S6_GMAC_MACMIICMD_SCAN		1
-#define S6_GMAC_MACMIIADDR	0x028
-#define S6_GMAC_MACMIIADDR_REG		0
-#define S6_GMAC_MACMIIADDR_REG_MASK		0x1F
-#define S6_GMAC_MACMIIADDR_PHY		8
-#define S6_GMAC_MACMIIADDR_PHY_MASK		0x1F
-#define S6_GMAC_MACMIICTRL	0x02C
-#define S6_GMAC_MACMIISTAT	0x030
-#define S6_GMAC_MACMIIINDI	0x034
-#define S6_GMAC_MACMIIINDI_BUSY		0
-#define S6_GMAC_MACMIIINDI_SCAN		1
-#define S6_GMAC_MACMIIINDI_INVAL	2
-#define S6_GMAC_MACINTERFSTAT	0x03C
-#define S6_GMAC_MACINTERFSTAT_LINKFAIL	3
-#define S6_GMAC_MACINTERFSTAT_EXCESSDEF	9
-#define S6_GMAC_MACSTATADDR1	0x040
-#define S6_GMAC_MACSTATADDR2	0x044
-
-#define S6_GMAC_FIFOCONF0	0x048
-#define S6_GMAC_FIFOCONF0_HSTRSTWT	0
-#define S6_GMAC_FIFOCONF0_HSTRSTSR	1
-#define S6_GMAC_FIFOCONF0_HSTRSTFR	2
-#define S6_GMAC_FIFOCONF0_HSTRSTST	3
-#define S6_GMAC_FIFOCONF0_HSTRSTFT	4
-#define S6_GMAC_FIFOCONF0_WTMENREQ	8
-#define S6_GMAC_FIFOCONF0_SRFENREQ	9
-#define S6_GMAC_FIFOCONF0_FRFENREQ	10
-#define S6_GMAC_FIFOCONF0_STFENREQ	11
-#define S6_GMAC_FIFOCONF0_FTFENREQ	12
-#define S6_GMAC_FIFOCONF0_WTMENRPLY	16
-#define S6_GMAC_FIFOCONF0_SRFENRPLY	17
-#define S6_GMAC_FIFOCONF0_FRFENRPLY	18
-#define S6_GMAC_FIFOCONF0_STFENRPLY	19
-#define S6_GMAC_FIFOCONF0_FTFENRPLY	20
-#define S6_GMAC_FIFOCONF1	0x04C
-#define S6_GMAC_FIFOCONF2	0x050
-#define S6_GMAC_FIFOCONF2_CFGLWM	0
-#define S6_GMAC_FIFOCONF2_CFGHWM	16
-#define S6_GMAC_FIFOCONF3	0x054
-#define S6_GMAC_FIFOCONF3_CFGFTTH	0
-#define S6_GMAC_FIFOCONF3_CFGHWMFT	16
-#define S6_GMAC_FIFOCONF4	0x058
-#define S6_GMAC_FIFOCONF_RSV_PREVDROP	0
-#define S6_GMAC_FIFOCONF_RSV_RUNT	1
-#define S6_GMAC_FIFOCONF_RSV_FALSECAR	2
-#define S6_GMAC_FIFOCONF_RSV_CODEERR	3
-#define S6_GMAC_FIFOCONF_RSV_CRCERR	4
-#define S6_GMAC_FIFOCONF_RSV_LENGTHERR	5
-#define S6_GMAC_FIFOCONF_RSV_LENRANGE	6
-#define S6_GMAC_FIFOCONF_RSV_OK		7
-#define S6_GMAC_FIFOCONF_RSV_MULTICAST	8
-#define S6_GMAC_FIFOCONF_RSV_BROADCAST	9
-#define S6_GMAC_FIFOCONF_RSV_DRIBBLE	10
-#define S6_GMAC_FIFOCONF_RSV_CTRLFRAME	11
-#define S6_GMAC_FIFOCONF_RSV_PAUSECTRL	12
-#define S6_GMAC_FIFOCONF_RSV_UNOPCODE	13
-#define S6_GMAC_FIFOCONF_RSV_VLANTAG	14
-#define S6_GMAC_FIFOCONF_RSV_LONGEVENT	15
-#define S6_GMAC_FIFOCONF_RSV_TRUNCATED	16
-#define S6_GMAC_FIFOCONF_RSV_MASK		0x3FFFF
-#define S6_GMAC_FIFOCONF5	0x05C
-#define S6_GMAC_FIFOCONF5_DROPLT64	18
-#define S6_GMAC_FIFOCONF5_CFGBYTM	19
-#define S6_GMAC_FIFOCONF5_RXDROPSIZE	20
-#define S6_GMAC_FIFOCONF5_RXDROPSIZE_MASK	0xF
-
-#define S6_GMAC_STAT_REGS	0x080
-#define S6_GMAC_STAT_SIZE_MIN		12
-#define S6_GMAC_STATTR64	0x080
-#define S6_GMAC_STATTR64_SIZE		18
-#define S6_GMAC_STATTR127	0x084
-#define S6_GMAC_STATTR127_SIZE		18
-#define S6_GMAC_STATTR255	0x088
-#define S6_GMAC_STATTR255_SIZE		18
-#define S6_GMAC_STATTR511	0x08C
-#define S6_GMAC_STATTR511_SIZE		18
-#define S6_GMAC_STATTR1K	0x090
-#define S6_GMAC_STATTR1K_SIZE		18
-#define S6_GMAC_STATTRMAX	0x094
-#define S6_GMAC_STATTRMAX_SIZE		18
-#define S6_GMAC_STATTRMGV	0x098
-#define S6_GMAC_STATTRMGV_SIZE		18
-#define S6_GMAC_STATRBYT	0x09C
-#define S6_GMAC_STATRBYT_SIZE		24
-#define S6_GMAC_STATRPKT	0x0A0
-#define S6_GMAC_STATRPKT_SIZE		18
-#define S6_GMAC_STATRFCS	0x0A4
-#define S6_GMAC_STATRFCS_SIZE		12
-#define S6_GMAC_STATRMCA	0x0A8
-#define S6_GMAC_STATRMCA_SIZE		18
-#define S6_GMAC_STATRBCA	0x0AC
-#define S6_GMAC_STATRBCA_SIZE		22
-#define S6_GMAC_STATRXCF	0x0B0
-#define S6_GMAC_STATRXCF_SIZE		18
-#define S6_GMAC_STATRXPF	0x0B4
-#define S6_GMAC_STATRXPF_SIZE		12
-#define S6_GMAC_STATRXUO	0x0B8
-#define S6_GMAC_STATRXUO_SIZE		12
-#define S6_GMAC_STATRALN	0x0BC
-#define S6_GMAC_STATRALN_SIZE		12
-#define S6_GMAC_STATRFLR	0x0C0
-#define S6_GMAC_STATRFLR_SIZE		16
-#define S6_GMAC_STATRCDE	0x0C4
-#define S6_GMAC_STATRCDE_SIZE		12
-#define S6_GMAC_STATRCSE	0x0C8
-#define S6_GMAC_STATRCSE_SIZE		12
-#define S6_GMAC_STATRUND	0x0CC
-#define S6_GMAC_STATRUND_SIZE		12
-#define S6_GMAC_STATROVR	0x0D0
-#define S6_GMAC_STATROVR_SIZE		12
-#define S6_GMAC_STATRFRG	0x0D4
-#define S6_GMAC_STATRFRG_SIZE		12
-#define S6_GMAC_STATRJBR	0x0D8
-#define S6_GMAC_STATRJBR_SIZE		12
-#define S6_GMAC_STATRDRP	0x0DC
-#define S6_GMAC_STATRDRP_SIZE		12
-#define S6_GMAC_STATTBYT	0x0E0
-#define S6_GMAC_STATTBYT_SIZE		24
-#define S6_GMAC_STATTPKT	0x0E4
-#define S6_GMAC_STATTPKT_SIZE		18
-#define S6_GMAC_STATTMCA	0x0E8
-#define S6_GMAC_STATTMCA_SIZE		18
-#define S6_GMAC_STATTBCA	0x0EC
-#define S6_GMAC_STATTBCA_SIZE		18
-#define S6_GMAC_STATTXPF	0x0F0
-#define S6_GMAC_STATTXPF_SIZE		12
-#define S6_GMAC_STATTDFR	0x0F4
-#define S6_GMAC_STATTDFR_SIZE		12
-#define S6_GMAC_STATTEDF	0x0F8
-#define S6_GMAC_STATTEDF_SIZE		12
-#define S6_GMAC_STATTSCL	0x0FC
-#define S6_GMAC_STATTSCL_SIZE		12
-#define S6_GMAC_STATTMCL	0x100
-#define S6_GMAC_STATTMCL_SIZE		12
-#define S6_GMAC_STATTLCL	0x104
-#define S6_GMAC_STATTLCL_SIZE		12
-#define S6_GMAC_STATTXCL	0x108
-#define S6_GMAC_STATTXCL_SIZE		12
-#define S6_GMAC_STATTNCL	0x10C
-#define S6_GMAC_STATTNCL_SIZE		13
-#define S6_GMAC_STATTPFH	0x110
-#define S6_GMAC_STATTPFH_SIZE		12
-#define S6_GMAC_STATTDRP	0x114
-#define S6_GMAC_STATTDRP_SIZE		12
-#define S6_GMAC_STATTJBR	0x118
-#define S6_GMAC_STATTJBR_SIZE		12
-#define S6_GMAC_STATTFCS	0x11C
-#define S6_GMAC_STATTFCS_SIZE		12
-#define S6_GMAC_STATTXCF	0x120
-#define S6_GMAC_STATTXCF_SIZE		12
-#define S6_GMAC_STATTOVR	0x124
-#define S6_GMAC_STATTOVR_SIZE		12
-#define S6_GMAC_STATTUND	0x128
-#define S6_GMAC_STATTUND_SIZE		12
-#define S6_GMAC_STATTFRG	0x12C
-#define S6_GMAC_STATTFRG_SIZE		12
-#define S6_GMAC_STATCARRY(n)	(0x130 + 4*(n))
-#define S6_GMAC_STATCARRYMSK(n)	(0x138 + 4*(n))
-#define S6_GMAC_STATCARRY1_RDRP		0
-#define S6_GMAC_STATCARRY1_RJBR		1
-#define S6_GMAC_STATCARRY1_RFRG		2
-#define S6_GMAC_STATCARRY1_ROVR		3
-#define S6_GMAC_STATCARRY1_RUND		4
-#define S6_GMAC_STATCARRY1_RCSE		5
-#define S6_GMAC_STATCARRY1_RCDE		6
-#define S6_GMAC_STATCARRY1_RFLR		7
-#define S6_GMAC_STATCARRY1_RALN		8
-#define S6_GMAC_STATCARRY1_RXUO		9
-#define S6_GMAC_STATCARRY1_RXPF		10
-#define S6_GMAC_STATCARRY1_RXCF		11
-#define S6_GMAC_STATCARRY1_RBCA		12
-#define S6_GMAC_STATCARRY1_RMCA		13
-#define S6_GMAC_STATCARRY1_RFCS		14
-#define S6_GMAC_STATCARRY1_RPKT		15
-#define S6_GMAC_STATCARRY1_RBYT		16
-#define S6_GMAC_STATCARRY1_TRMGV	25
-#define S6_GMAC_STATCARRY1_TRMAX	26
-#define S6_GMAC_STATCARRY1_TR1K		27
-#define S6_GMAC_STATCARRY1_TR511	28
-#define S6_GMAC_STATCARRY1_TR255	29
-#define S6_GMAC_STATCARRY1_TR127	30
-#define S6_GMAC_STATCARRY1_TR64		31
-#define S6_GMAC_STATCARRY2_TDRP		0
-#define S6_GMAC_STATCARRY2_TPFH		1
-#define S6_GMAC_STATCARRY2_TNCL		2
-#define S6_GMAC_STATCARRY2_TXCL		3
-#define S6_GMAC_STATCARRY2_TLCL		4
-#define S6_GMAC_STATCARRY2_TMCL		5
-#define S6_GMAC_STATCARRY2_TSCL		6
-#define S6_GMAC_STATCARRY2_TEDF		7
-#define S6_GMAC_STATCARRY2_TDFR		8
-#define S6_GMAC_STATCARRY2_TXPF		9
-#define S6_GMAC_STATCARRY2_TBCA		10
-#define S6_GMAC_STATCARRY2_TMCA		11
-#define S6_GMAC_STATCARRY2_TPKT		12
-#define S6_GMAC_STATCARRY2_TBYT		13
-#define S6_GMAC_STATCARRY2_TFRG		14
-#define S6_GMAC_STATCARRY2_TUND		15
-#define S6_GMAC_STATCARRY2_TOVR		16
-#define S6_GMAC_STATCARRY2_TXCF		17
-#define S6_GMAC_STATCARRY2_TFCS		18
-#define S6_GMAC_STATCARRY2_TJBR		19
-
-#define S6_GMAC_HOST_PBLKCTRL	0x140
-#define S6_GMAC_HOST_PBLKCTRL_TXENA	0
-#define S6_GMAC_HOST_PBLKCTRL_RXENA	1
-#define S6_GMAC_HOST_PBLKCTRL_TXSRES	2
-#define S6_GMAC_HOST_PBLKCTRL_RXSRES	3
-#define S6_GMAC_HOST_PBLKCTRL_TXBSIZ	8
-#define S6_GMAC_HOST_PBLKCTRL_RXBSIZ	12
-#define S6_GMAC_HOST_PBLKCTRL_SIZ_16		4
-#define S6_GMAC_HOST_PBLKCTRL_SIZ_32		5
-#define S6_GMAC_HOST_PBLKCTRL_SIZ_64		6
-#define S6_GMAC_HOST_PBLKCTRL_SIZ_128		7
-#define S6_GMAC_HOST_PBLKCTRL_SIZ_MASK		0xF
-#define S6_GMAC_HOST_PBLKCTRL_STATENA	16
-#define S6_GMAC_HOST_PBLKCTRL_STATAUTOZ	17
-#define S6_GMAC_HOST_PBLKCTRL_STATCLEAR	18
-#define S6_GMAC_HOST_PBLKCTRL_RGMII	19
-#define S6_GMAC_HOST_INTMASK	0x144
-#define S6_GMAC_HOST_INTSTAT	0x148
-#define S6_GMAC_HOST_INT_TXBURSTOVER	3
-#define S6_GMAC_HOST_INT_TXPREWOVER	4
-#define S6_GMAC_HOST_INT_RXBURSTUNDER	5
-#define S6_GMAC_HOST_INT_RXPOSTRFULL	6
-#define S6_GMAC_HOST_INT_RXPOSTRUNDER	7
-#define S6_GMAC_HOST_RXFIFOHWM	0x14C
-#define S6_GMAC_HOST_CTRLFRAMXP	0x150
-#define S6_GMAC_HOST_DSTADDRLO(n) (0x160 + 8*(n))
-#define S6_GMAC_HOST_DSTADDRHI(n) (0x164 + 8*(n))
-#define S6_GMAC_HOST_DSTMASKLO(n) (0x180 + 8*(n))
-#define S6_GMAC_HOST_DSTMASKHI(n) (0x184 + 8*(n))
-
-#define S6_GMAC_BURST_PREWR	0x1B0
-#define S6_GMAC_BURST_PREWR_LEN		0
-#define S6_GMAC_BURST_PREWR_LEN_MASK		((1 << 20) - 1)
-#define S6_GMAC_BURST_PREWR_CFE		20
-#define S6_GMAC_BURST_PREWR_PPE		21
-#define S6_GMAC_BURST_PREWR_FCS		22
-#define S6_GMAC_BURST_PREWR_PAD		23
-#define S6_GMAC_BURST_POSTRD	0x1D0
-#define S6_GMAC_BURST_POSTRD_LEN	0
-#define S6_GMAC_BURST_POSTRD_LEN_MASK		((1 << 20) - 1)
-#define S6_GMAC_BURST_POSTRD_DROP	20
-
-
-/* data handling */
-
-#define S6_NUM_TX_SKB	8	/* must be larger than TX fifo size */
-#define S6_NUM_RX_SKB	16
-#define S6_MAX_FRLEN	1536
-
-struct s6gmac {
-	u32 reg;
-	u32 tx_dma;
-	u32 rx_dma;
-	u32 io;
-	u8 tx_chan;
-	u8 rx_chan;
-	spinlock_t lock;
-	u8 tx_skb_i, tx_skb_o;
-	u8 rx_skb_i, rx_skb_o;
-	struct sk_buff *tx_skb[S6_NUM_TX_SKB];
-	struct sk_buff *rx_skb[S6_NUM_RX_SKB];
-	unsigned long carry[sizeof(struct net_device_stats) / sizeof(long)];
-	unsigned long stats[sizeof(struct net_device_stats) / sizeof(long)];
-	struct phy_device *phydev;
-	struct {
-		struct mii_bus *bus;
-		int irq[PHY_MAX_ADDR];
-	} mii;
-	struct {
-		unsigned int mbit;
-		u8 giga;
-		u8 isup;
-		u8 full;
-	} link;
-};
-
-static void s6gmac_rx_fillfifo(struct net_device *dev)
-{
-	struct s6gmac *pd = netdev_priv(dev);
-	struct sk_buff *skb;
-	while ((((u8)(pd->rx_skb_i - pd->rx_skb_o)) < S6_NUM_RX_SKB) &&
-	       (!s6dmac_fifo_full(pd->rx_dma, pd->rx_chan)) &&
-	       (skb = netdev_alloc_skb(dev, S6_MAX_FRLEN + 2))) {
-		pd->rx_skb[(pd->rx_skb_i++) % S6_NUM_RX_SKB] = skb;
-		s6dmac_put_fifo_cache(pd->rx_dma, pd->rx_chan,
-			pd->io, (u32)skb->data, S6_MAX_FRLEN);
-	}
-}
-
-static void s6gmac_rx_interrupt(struct net_device *dev)
-{
-	struct s6gmac *pd = netdev_priv(dev);
-	u32 pfx;
-	struct sk_buff *skb;
-	while (((u8)(pd->rx_skb_i - pd->rx_skb_o)) >
-			s6dmac_pending_count(pd->rx_dma, pd->rx_chan)) {
-		skb = pd->rx_skb[(pd->rx_skb_o++) % S6_NUM_RX_SKB];
-		pfx = readl(pd->reg + S6_GMAC_BURST_POSTRD);
-		if (pfx & (1 << S6_GMAC_BURST_POSTRD_DROP)) {
-			dev_kfree_skb_irq(skb);
-		} else {
-			skb_put(skb, (pfx >> S6_GMAC_BURST_POSTRD_LEN)
-				& S6_GMAC_BURST_POSTRD_LEN_MASK);
-			skb->protocol = eth_type_trans(skb, dev);
-			skb->ip_summed = CHECKSUM_UNNECESSARY;
-			netif_rx(skb);
-		}
-	}
-}
-
-static void s6gmac_tx_interrupt(struct net_device *dev)
-{
-	struct s6gmac *pd = netdev_priv(dev);
-	while (((u8)(pd->tx_skb_i - pd->tx_skb_o)) >
-			s6dmac_pending_count(pd->tx_dma, pd->tx_chan)) {
-		dev_kfree_skb_irq(pd->tx_skb[(pd->tx_skb_o++) % S6_NUM_TX_SKB]);
-	}
-	if (!s6dmac_fifo_full(pd->tx_dma, pd->tx_chan))
-		netif_wake_queue(dev);
-}
-
-struct s6gmac_statinf {
-	unsigned reg_size : 4; /* 0: unused */
-	unsigned reg_off : 6;
-	unsigned net_index : 6;
-};
-
-#define S6_STATS_B (8 * sizeof(u32))
-#define S6_STATS_C(b, r, f) [b] = { \
-	BUILD_BUG_ON_ZERO(r##_SIZE < S6_GMAC_STAT_SIZE_MIN) + \
-	BUILD_BUG_ON_ZERO((r##_SIZE - (S6_GMAC_STAT_SIZE_MIN - 1)) \
-			>= (1<<4)) + \
-	r##_SIZE - (S6_GMAC_STAT_SIZE_MIN - 1), \
-	BUILD_BUG_ON_ZERO(((unsigned)((r - S6_GMAC_STAT_REGS) / sizeof(u32))) \
-			>= ((1<<6)-1)) + \
-	(r - S6_GMAC_STAT_REGS) / sizeof(u32), \
-	BUILD_BUG_ON_ZERO((offsetof(struct net_device_stats, f)) \
-			% sizeof(unsigned long)) + \
-	BUILD_BUG_ON_ZERO((((unsigned)(offsetof(struct net_device_stats, f)) \
-			/ sizeof(unsigned long)) >= (1<<6))) + \
-	BUILD_BUG_ON_ZERO((sizeof(((struct net_device_stats *)0)->f) \
-			!= sizeof(unsigned long))) + \
-	(offsetof(struct net_device_stats, f)) / sizeof(unsigned long)},
-
-static const struct s6gmac_statinf statinf[2][S6_STATS_B] = { {
-	S6_STATS_C(S6_GMAC_STATCARRY1_RBYT, S6_GMAC_STATRBYT, rx_bytes)
-	S6_STATS_C(S6_GMAC_STATCARRY1_RPKT, S6_GMAC_STATRPKT, rx_packets)
-	S6_STATS_C(S6_GMAC_STATCARRY1_RFCS, S6_GMAC_STATRFCS, rx_crc_errors)
-	S6_STATS_C(S6_GMAC_STATCARRY1_RMCA, S6_GMAC_STATRMCA, multicast)
-	S6_STATS_C(S6_GMAC_STATCARRY1_RALN, S6_GMAC_STATRALN, rx_frame_errors)
-	S6_STATS_C(S6_GMAC_STATCARRY1_RFLR, S6_GMAC_STATRFLR, rx_length_errors)
-	S6_STATS_C(S6_GMAC_STATCARRY1_RCDE, S6_GMAC_STATRCDE, rx_missed_errors)
-	S6_STATS_C(S6_GMAC_STATCARRY1_RUND, S6_GMAC_STATRUND, rx_length_errors)
-	S6_STATS_C(S6_GMAC_STATCARRY1_ROVR, S6_GMAC_STATROVR, rx_length_errors)
-	S6_STATS_C(S6_GMAC_STATCARRY1_RFRG, S6_GMAC_STATRFRG, rx_crc_errors)
-	S6_STATS_C(S6_GMAC_STATCARRY1_RJBR, S6_GMAC_STATRJBR, rx_crc_errors)
-	S6_STATS_C(S6_GMAC_STATCARRY1_RDRP, S6_GMAC_STATRDRP, rx_dropped)
-}, {
-	S6_STATS_C(S6_GMAC_STATCARRY2_TBYT, S6_GMAC_STATTBYT, tx_bytes)
-	S6_STATS_C(S6_GMAC_STATCARRY2_TPKT, S6_GMAC_STATTPKT, tx_packets)
-	S6_STATS_C(S6_GMAC_STATCARRY2_TEDF, S6_GMAC_STATTEDF, tx_aborted_errors)
-	S6_STATS_C(S6_GMAC_STATCARRY2_TXCL, S6_GMAC_STATTXCL, tx_aborted_errors)
-	S6_STATS_C(S6_GMAC_STATCARRY2_TNCL, S6_GMAC_STATTNCL, collisions)
-	S6_STATS_C(S6_GMAC_STATCARRY2_TDRP, S6_GMAC_STATTDRP, tx_dropped)
-	S6_STATS_C(S6_GMAC_STATCARRY2_TJBR, S6_GMAC_STATTJBR, tx_errors)
-	S6_STATS_C(S6_GMAC_STATCARRY2_TFCS, S6_GMAC_STATTFCS, tx_errors)
-	S6_STATS_C(S6_GMAC_STATCARRY2_TOVR, S6_GMAC_STATTOVR, tx_errors)
-	S6_STATS_C(S6_GMAC_STATCARRY2_TUND, S6_GMAC_STATTUND, tx_errors)
-	S6_STATS_C(S6_GMAC_STATCARRY2_TFRG, S6_GMAC_STATTFRG, tx_errors)
-} };
-
-static void s6gmac_stats_collect(struct s6gmac *pd,
-		const struct s6gmac_statinf *inf)
-{
-	int b;
-	for (b = 0; b < S6_STATS_B; b++) {
-		if (inf[b].reg_size) {
-			pd->stats[inf[b].net_index] +=
-				readl(pd->reg + S6_GMAC_STAT_REGS
-					+ sizeof(u32) * inf[b].reg_off);
-		}
-	}
-}
-
-static void s6gmac_stats_carry(struct s6gmac *pd,
-		const struct s6gmac_statinf *inf, u32 mask)
-{
-	int b;
-	while (mask) {
-		b = fls(mask) - 1;
-		mask &= ~(1 << b);
-		pd->carry[inf[b].net_index] += (1 << inf[b].reg_size);
-	}
-}
-
-static inline u32 s6gmac_stats_pending(struct s6gmac *pd, int carry)
-{
-	int r = readl(pd->reg + S6_GMAC_STATCARRY(carry)) &
-		~readl(pd->reg + S6_GMAC_STATCARRYMSK(carry));
-	return r;
-}
-
-static inline void s6gmac_stats_interrupt(struct s6gmac *pd, int carry)
-{
-	u32 mask;
-	mask = s6gmac_stats_pending(pd, carry);
-	if (mask) {
-		writel(mask, pd->reg + S6_GMAC_STATCARRY(carry));
-		s6gmac_stats_carry(pd, &statinf[carry][0], mask);
-	}
-}
-
-static irqreturn_t s6gmac_interrupt(int irq, void *dev_id)
-{
-	struct net_device *dev = (struct net_device *)dev_id;
-	struct s6gmac *pd = netdev_priv(dev);
-	if (!dev)
-		return IRQ_NONE;
-	spin_lock(&pd->lock);
-	if (s6dmac_termcnt_irq(pd->rx_dma, pd->rx_chan))
-		s6gmac_rx_interrupt(dev);
-	s6gmac_rx_fillfifo(dev);
-	if (s6dmac_termcnt_irq(pd->tx_dma, pd->tx_chan))
-		s6gmac_tx_interrupt(dev);
-	s6gmac_stats_interrupt(pd, 0);
-	s6gmac_stats_interrupt(pd, 1);
-	spin_unlock(&pd->lock);
-	return IRQ_HANDLED;
-}
-
-static inline void s6gmac_set_dstaddr(struct s6gmac *pd, int n,
-	u32 addrlo, u32 addrhi, u32 masklo, u32 maskhi)
-{
-	writel(addrlo, pd->reg + S6_GMAC_HOST_DSTADDRLO(n));
-	writel(addrhi, pd->reg + S6_GMAC_HOST_DSTADDRHI(n));
-	writel(masklo, pd->reg + S6_GMAC_HOST_DSTMASKLO(n));
-	writel(maskhi, pd->reg + S6_GMAC_HOST_DSTMASKHI(n));
-}
-
-static inline void s6gmac_stop_device(struct net_device *dev)
-{
-	struct s6gmac *pd = netdev_priv(dev);
-	writel(0, pd->reg + S6_GMAC_MACCONF1);
-}
-
-static inline void s6gmac_init_device(struct net_device *dev)
-{
-	struct s6gmac *pd = netdev_priv(dev);
-	int is_rgmii = !!(pd->phydev->supported
-		& (SUPPORTED_1000baseT_Full | SUPPORTED_1000baseT_Half));
-#if 0
-	writel(1 << S6_GMAC_MACCONF1_SYNCTX |
-		1 << S6_GMAC_MACCONF1_SYNCRX |
-		1 << S6_GMAC_MACCONF1_TXFLOWCTRL |
-		1 << S6_GMAC_MACCONF1_RXFLOWCTRL |
-		1 << S6_GMAC_MACCONF1_RESTXFUNC |
-		1 << S6_GMAC_MACCONF1_RESRXFUNC |
-		1 << S6_GMAC_MACCONF1_RESTXMACCTRL |
-		1 << S6_GMAC_MACCONF1_RESRXMACCTRL,
-		pd->reg + S6_GMAC_MACCONF1);
-#endif
-	writel(1 << S6_GMAC_MACCONF1_SOFTRES, pd->reg + S6_GMAC_MACCONF1);
-	udelay(1000);
-	writel(1 << S6_GMAC_MACCONF1_TXENA | 1 << S6_GMAC_MACCONF1_RXENA,
-		pd->reg + S6_GMAC_MACCONF1);
-	writel(1 << S6_GMAC_HOST_PBLKCTRL_TXSRES |
-		1 << S6_GMAC_HOST_PBLKCTRL_RXSRES,
-		pd->reg + S6_GMAC_HOST_PBLKCTRL);
-	writel(S6_GMAC_HOST_PBLKCTRL_SIZ_128 << S6_GMAC_HOST_PBLKCTRL_TXBSIZ |
-		S6_GMAC_HOST_PBLKCTRL_SIZ_128 << S6_GMAC_HOST_PBLKCTRL_RXBSIZ |
-		1 << S6_GMAC_HOST_PBLKCTRL_STATENA |
-		1 << S6_GMAC_HOST_PBLKCTRL_STATCLEAR |
-		is_rgmii << S6_GMAC_HOST_PBLKCTRL_RGMII,
-		pd->reg + S6_GMAC_HOST_PBLKCTRL);
-	writel(1 << S6_GMAC_MACCONF1_TXENA |
-		1 << S6_GMAC_MACCONF1_RXENA |
-		(dev->flags & IFF_LOOPBACK ? 1 : 0)
-			<< S6_GMAC_MACCONF1_LOOPBACK,
-		pd->reg + S6_GMAC_MACCONF1);
-	writel(dev->mtu && (dev->mtu < (S6_MAX_FRLEN - ETH_HLEN-ETH_FCS_LEN)) ?
-			dev->mtu+ETH_HLEN+ETH_FCS_LEN : S6_MAX_FRLEN,
-		pd->reg + S6_GMAC_MACMAXFRAMELEN);
-	writel((pd->link.full ? 1 : 0) << S6_GMAC_MACCONF2_FULL |
-		1 << S6_GMAC_MACCONF2_PADCRCENA |
-		1 << S6_GMAC_MACCONF2_LENGTHFCHK |
-		(pd->link.giga ?
-			S6_GMAC_MACCONF2_IFMODE_BYTE :
-			S6_GMAC_MACCONF2_IFMODE_NIBBLE)
-			<< S6_GMAC_MACCONF2_IFMODE |
-		7 << S6_GMAC_MACCONF2_PREAMBLELEN,
-		pd->reg + S6_GMAC_MACCONF2);
-	writel(0, pd->reg + S6_GMAC_MACSTATADDR1);
-	writel(0, pd->reg + S6_GMAC_MACSTATADDR2);
-	writel(1 << S6_GMAC_FIFOCONF0_WTMENREQ |
-		1 << S6_GMAC_FIFOCONF0_SRFENREQ |
-		1 << S6_GMAC_FIFOCONF0_FRFENREQ |
-		1 << S6_GMAC_FIFOCONF0_STFENREQ |
-		1 << S6_GMAC_FIFOCONF0_FTFENREQ,
-		pd->reg + S6_GMAC_FIFOCONF0);
-	writel(128 << S6_GMAC_FIFOCONF3_CFGFTTH |
-		128 << S6_GMAC_FIFOCONF3_CFGHWMFT,
-		pd->reg + S6_GMAC_FIFOCONF3);
-	writel((S6_GMAC_FIFOCONF_RSV_MASK & ~(
-			1 << S6_GMAC_FIFOCONF_RSV_RUNT |
-			1 << S6_GMAC_FIFOCONF_RSV_CRCERR |
-			1 << S6_GMAC_FIFOCONF_RSV_OK |
-			1 << S6_GMAC_FIFOCONF_RSV_DRIBBLE |
-			1 << S6_GMAC_FIFOCONF_RSV_CTRLFRAME |
-			1 << S6_GMAC_FIFOCONF_RSV_PAUSECTRL |
-			1 << S6_GMAC_FIFOCONF_RSV_UNOPCODE |
-			1 << S6_GMAC_FIFOCONF_RSV_TRUNCATED)) |
-		1 << S6_GMAC_FIFOCONF5_DROPLT64 |
-		pd->link.giga << S6_GMAC_FIFOCONF5_CFGBYTM |
-		1 << S6_GMAC_FIFOCONF5_RXDROPSIZE,
-		pd->reg + S6_GMAC_FIFOCONF5);
-	writel(1 << S6_GMAC_FIFOCONF_RSV_RUNT |
-		1 << S6_GMAC_FIFOCONF_RSV_CRCERR |
-		1 << S6_GMAC_FIFOCONF_RSV_DRIBBLE |
-		1 << S6_GMAC_FIFOCONF_RSV_CTRLFRAME |
-		1 << S6_GMAC_FIFOCONF_RSV_PAUSECTRL |
-		1 << S6_GMAC_FIFOCONF_RSV_UNOPCODE |
-		1 << S6_GMAC_FIFOCONF_RSV_TRUNCATED,
-		pd->reg + S6_GMAC_FIFOCONF4);
-	s6gmac_set_dstaddr(pd, 0,
-		0xFFFFFFFF, 0x0000FFFF, 0xFFFFFFFF, 0x0000FFFF);
-	s6gmac_set_dstaddr(pd, 1,
-		dev->dev_addr[5] |
-		dev->dev_addr[4] << 8 |
-		dev->dev_addr[3] << 16 |
-		dev->dev_addr[2] << 24,
-		dev->dev_addr[1] |
-		dev->dev_addr[0] << 8,
-		0xFFFFFFFF, 0x0000FFFF);
-	s6gmac_set_dstaddr(pd, 2,
-		0x00000000, 0x00000100, 0x00000000, 0x00000100);
-	s6gmac_set_dstaddr(pd, 3,
-		0x00000000, 0x00000000, 0x00000000, 0x00000000);
-	writel(1 << S6_GMAC_HOST_PBLKCTRL_TXENA |
-		1 << S6_GMAC_HOST_PBLKCTRL_RXENA |
-		S6_GMAC_HOST_PBLKCTRL_SIZ_128 << S6_GMAC_HOST_PBLKCTRL_TXBSIZ |
-		S6_GMAC_HOST_PBLKCTRL_SIZ_128 << S6_GMAC_HOST_PBLKCTRL_RXBSIZ |
-		1 << S6_GMAC_HOST_PBLKCTRL_STATENA |
-		1 << S6_GMAC_HOST_PBLKCTRL_STATCLEAR |
-		is_rgmii << S6_GMAC_HOST_PBLKCTRL_RGMII,
-		pd->reg + S6_GMAC_HOST_PBLKCTRL);
-}
-
-static void s6mii_enable(struct s6gmac *pd)
-{
-	writel(readl(pd->reg + S6_GMAC_MACCONF1) &
-		~(1 << S6_GMAC_MACCONF1_SOFTRES),
-		pd->reg + S6_GMAC_MACCONF1);
-	writel((readl(pd->reg + S6_GMAC_MACMIICONF)
-		& ~(S6_GMAC_MACMIICONF_CSEL_MASK << S6_GMAC_MACMIICONF_CSEL))
-		| (S6_GMAC_MACMIICONF_CSEL_DIV168 << S6_GMAC_MACMIICONF_CSEL),
-		pd->reg + S6_GMAC_MACMIICONF);
-}
-
-static int s6mii_busy(struct s6gmac *pd, int tmo)
-{
-	while (readl(pd->reg + S6_GMAC_MACMIIINDI)) {
-		if (--tmo == 0)
-			return -ETIME;
-		udelay(64);
-	}
-	return 0;
-}
-
-static int s6mii_read(struct mii_bus *bus, int phy_addr, int regnum)
-{
-	struct s6gmac *pd = bus->priv;
-	s6mii_enable(pd);
-	if (s6mii_busy(pd, 256))
-		return -ETIME;
-	writel(phy_addr << S6_GMAC_MACMIIADDR_PHY |
-		regnum << S6_GMAC_MACMIIADDR_REG,
-		pd->reg + S6_GMAC_MACMIIADDR);
-	writel(1 << S6_GMAC_MACMIICMD_READ, pd->reg + S6_GMAC_MACMIICMD);
-	writel(0, pd->reg + S6_GMAC_MACMIICMD);
-	if (s6mii_busy(pd, 256))
-		return -ETIME;
-	return (u16)readl(pd->reg + S6_GMAC_MACMIISTAT);
-}
-
-static int s6mii_write(struct mii_bus *bus, int phy_addr, int regnum, u16 value)
-{
-	struct s6gmac *pd = bus->priv;
-	s6mii_enable(pd);
-	if (s6mii_busy(pd, 256))
-		return -ETIME;
-	writel(phy_addr << S6_GMAC_MACMIIADDR_PHY |
-		regnum << S6_GMAC_MACMIIADDR_REG,
-		pd->reg + S6_GMAC_MACMIIADDR);
-	writel(value, pd->reg + S6_GMAC_MACMIICTRL);
-	if (s6mii_busy(pd, 256))
-		return -ETIME;
-	return 0;
-}
-
-static int s6mii_reset(struct mii_bus *bus)
-{
-	struct s6gmac *pd = bus->priv;
-	s6mii_enable(pd);
-	if (s6mii_busy(pd, PHY_INIT_TIMEOUT))
-		return -ETIME;
-	return 0;
-}
-
-static void s6gmac_set_rgmii_txclock(struct s6gmac *pd)
-{
-	u32 pllsel = readl(S6_REG_GREG1 + S6_GREG1_PLLSEL);
-	pllsel &= ~(S6_GREG1_PLLSEL_GMAC_MASK << S6_GREG1_PLLSEL_GMAC);
-	switch (pd->link.mbit) {
-	case 10:
-		pllsel |= S6_GREG1_PLLSEL_GMAC_2500KHZ << S6_GREG1_PLLSEL_GMAC;
-		break;
-	case 100:
-		pllsel |= S6_GREG1_PLLSEL_GMAC_25MHZ << S6_GREG1_PLLSEL_GMAC;
-		break;
-	case 1000:
-		pllsel |= S6_GREG1_PLLSEL_GMAC_125MHZ << S6_GREG1_PLLSEL_GMAC;
-		break;
-	default:
-		return;
-	}
-	writel(pllsel, S6_REG_GREG1 + S6_GREG1_PLLSEL);
-}
-
-static inline void s6gmac_linkisup(struct net_device *dev, int isup)
-{
-	struct s6gmac *pd = netdev_priv(dev);
-	struct phy_device *phydev = pd->phydev;
-
-	pd->link.full = phydev->duplex;
-	pd->link.giga = (phydev->speed == 1000);
-	if (pd->link.mbit != phydev->speed) {
-		pd->link.mbit = phydev->speed;
-		s6gmac_set_rgmii_txclock(pd);
-	}
-	pd->link.isup = isup;
-	if (isup)
-		netif_carrier_on(dev);
-	phy_print_status(phydev);
-}
-
-static void s6gmac_adjust_link(struct net_device *dev)
-{
-	struct s6gmac *pd = netdev_priv(dev);
-	struct phy_device *phydev = pd->phydev;
-	if (pd->link.isup &&
-			(!phydev->link ||
-			(pd->link.mbit != phydev->speed) ||
-			(pd->link.full != phydev->duplex))) {
-		pd->link.isup = 0;
-		netif_tx_disable(dev);
-		if (!phydev->link) {
-			netif_carrier_off(dev);
-			phy_print_status(phydev);
-		}
-	}
-	if (!pd->link.isup && phydev->link) {
-		if (pd->link.full != phydev->duplex) {
-			u32 maccfg = readl(pd->reg + S6_GMAC_MACCONF2);
-			if (phydev->duplex)
-				maccfg |= 1 << S6_GMAC_MACCONF2_FULL;
-			else
-				maccfg &= ~(1 << S6_GMAC_MACCONF2_FULL);
-			writel(maccfg, pd->reg + S6_GMAC_MACCONF2);
-		}
-
-		if (pd->link.giga != (phydev->speed == 1000)) {
-			u32 fifocfg = readl(pd->reg + S6_GMAC_FIFOCONF5);
-			u32 maccfg = readl(pd->reg + S6_GMAC_MACCONF2);
-			maccfg &= ~(S6_GMAC_MACCONF2_IFMODE_MASK
-				     << S6_GMAC_MACCONF2_IFMODE);
-			if (phydev->speed == 1000) {
-				fifocfg |= 1 << S6_GMAC_FIFOCONF5_CFGBYTM;
-				maccfg |= S6_GMAC_MACCONF2_IFMODE_BYTE
-					   << S6_GMAC_MACCONF2_IFMODE;
-			} else {
-				fifocfg &= ~(1 << S6_GMAC_FIFOCONF5_CFGBYTM);
-				maccfg |= S6_GMAC_MACCONF2_IFMODE_NIBBLE
-					   << S6_GMAC_MACCONF2_IFMODE;
-			}
-			writel(fifocfg, pd->reg + S6_GMAC_FIFOCONF5);
-			writel(maccfg, pd->reg + S6_GMAC_MACCONF2);
-		}
-
-		if (!s6dmac_fifo_full(pd->tx_dma, pd->tx_chan))
-			netif_wake_queue(dev);
-		s6gmac_linkisup(dev, 1);
-	}
-}
-
-static inline int s6gmac_phy_start(struct net_device *dev)
-{
-	struct s6gmac *pd = netdev_priv(dev);
-	int i = 0;
-	struct phy_device *p = NULL;
-	while ((i < PHY_MAX_ADDR) && (!(p = pd->mii.bus->phy_map[i])))
-		i++;
-	p = phy_connect(dev, dev_name(&p->dev), &s6gmac_adjust_link,
-			PHY_INTERFACE_MODE_RGMII);
-	if (IS_ERR(p)) {
-		printk(KERN_ERR "%s: Could not attach to PHY\n", dev->name);
-		return PTR_ERR(p);
-	}
-	p->supported &= PHY_GBIT_FEATURES;
-	p->advertising = p->supported;
-	pd->phydev = p;
-	return 0;
-}
-
-static inline void s6gmac_init_stats(struct net_device *dev)
-{
-	struct s6gmac *pd = netdev_priv(dev);
-	u32 mask;
-	mask =	1 << S6_GMAC_STATCARRY1_RDRP |
-		1 << S6_GMAC_STATCARRY1_RJBR |
-		1 << S6_GMAC_STATCARRY1_RFRG |
-		1 << S6_GMAC_STATCARRY1_ROVR |
-		1 << S6_GMAC_STATCARRY1_RUND |
-		1 << S6_GMAC_STATCARRY1_RCDE |
-		1 << S6_GMAC_STATCARRY1_RFLR |
-		1 << S6_GMAC_STATCARRY1_RALN |
-		1 << S6_GMAC_STATCARRY1_RMCA |
-		1 << S6_GMAC_STATCARRY1_RFCS |
-		1 << S6_GMAC_STATCARRY1_RPKT |
-		1 << S6_GMAC_STATCARRY1_RBYT;
-	writel(mask, pd->reg + S6_GMAC_STATCARRY(0));
-	writel(~mask, pd->reg + S6_GMAC_STATCARRYMSK(0));
-	mask =	1 << S6_GMAC_STATCARRY2_TDRP |
-		1 << S6_GMAC_STATCARRY2_TNCL |
-		1 << S6_GMAC_STATCARRY2_TXCL |
-		1 << S6_GMAC_STATCARRY2_TEDF |
-		1 << S6_GMAC_STATCARRY2_TPKT |
-		1 << S6_GMAC_STATCARRY2_TBYT |
-		1 << S6_GMAC_STATCARRY2_TFRG |
-		1 << S6_GMAC_STATCARRY2_TUND |
-		1 << S6_GMAC_STATCARRY2_TOVR |
-		1 << S6_GMAC_STATCARRY2_TFCS |
-		1 << S6_GMAC_STATCARRY2_TJBR;
-	writel(mask, pd->reg + S6_GMAC_STATCARRY(1));
-	writel(~mask, pd->reg + S6_GMAC_STATCARRYMSK(1));
-}
-
-static inline void s6gmac_init_dmac(struct net_device *dev)
-{
-	struct s6gmac *pd = netdev_priv(dev);
-	s6dmac_disable_chan(pd->tx_dma, pd->tx_chan);
-	s6dmac_disable_chan(pd->rx_dma, pd->rx_chan);
-	s6dmac_disable_error_irqs(pd->tx_dma, 1 << S6_HIFDMA_GMACTX);
-	s6dmac_disable_error_irqs(pd->rx_dma, 1 << S6_HIFDMA_GMACRX);
-}
-
-static int s6gmac_tx(struct sk_buff *skb, struct net_device *dev)
-{
-	struct s6gmac *pd = netdev_priv(dev);
-	unsigned long flags;
-
-	spin_lock_irqsave(&pd->lock, flags);
-	writel(skb->len << S6_GMAC_BURST_PREWR_LEN |
-		0 << S6_GMAC_BURST_PREWR_CFE |
-		1 << S6_GMAC_BURST_PREWR_PPE |
-		1 << S6_GMAC_BURST_PREWR_FCS |
-		((skb->len < ETH_ZLEN) ? 1 : 0) << S6_GMAC_BURST_PREWR_PAD,
-		pd->reg + S6_GMAC_BURST_PREWR);
-	s6dmac_put_fifo_cache(pd->tx_dma, pd->tx_chan,
-		(u32)skb->data, pd->io, skb->len);
-	if (s6dmac_fifo_full(pd->tx_dma, pd->tx_chan))
-		netif_stop_queue(dev);
-	if (((u8)(pd->tx_skb_i - pd->tx_skb_o)) >= S6_NUM_TX_SKB) {
-		printk(KERN_ERR "GMAC BUG: skb tx ring overflow [%x, %x]\n",
-			pd->tx_skb_o, pd->tx_skb_i);
-		BUG();
-	}
-	pd->tx_skb[(pd->tx_skb_i++) % S6_NUM_TX_SKB] = skb;
-	spin_unlock_irqrestore(&pd->lock, flags);
-	return 0;
-}
-
-static void s6gmac_tx_timeout(struct net_device *dev)
-{
-	struct s6gmac *pd = netdev_priv(dev);
-	unsigned long flags;
-	spin_lock_irqsave(&pd->lock, flags);
-	s6gmac_tx_interrupt(dev);
-	spin_unlock_irqrestore(&pd->lock, flags);
-}
-
-static int s6gmac_open(struct net_device *dev)
-{
-	struct s6gmac *pd = netdev_priv(dev);
-	unsigned long flags;
-	phy_read_status(pd->phydev);
-	spin_lock_irqsave(&pd->lock, flags);
-	pd->link.mbit = 0;
-	s6gmac_linkisup(dev, pd->phydev->link);
-	s6gmac_init_device(dev);
-	s6gmac_init_stats(dev);
-	s6gmac_init_dmac(dev);
-	s6gmac_rx_fillfifo(dev);
-	s6dmac_enable_chan(pd->rx_dma, pd->rx_chan,
-		2, 1, 0, 1, 0, 0, 0, 7, -1, 2, 0, 1);
-	s6dmac_enable_chan(pd->tx_dma, pd->tx_chan,
-		2, 0, 1, 0, 0, 0, 0, 7, -1, 2, 0, 1);
-	writel(0 << S6_GMAC_HOST_INT_TXBURSTOVER |
-		0 << S6_GMAC_HOST_INT_TXPREWOVER |
-		0 << S6_GMAC_HOST_INT_RXBURSTUNDER |
-		0 << S6_GMAC_HOST_INT_RXPOSTRFULL |
-		0 << S6_GMAC_HOST_INT_RXPOSTRUNDER,
-		pd->reg + S6_GMAC_HOST_INTMASK);
-	spin_unlock_irqrestore(&pd->lock, flags);
-	phy_start(pd->phydev);
-	netif_start_queue(dev);
-	return 0;
-}
-
-static int s6gmac_stop(struct net_device *dev)
-{
-	struct s6gmac *pd = netdev_priv(dev);
-	unsigned long flags;
-	netif_stop_queue(dev);
-	phy_stop(pd->phydev);
-	spin_lock_irqsave(&pd->lock, flags);
-	s6gmac_init_dmac(dev);
-	s6gmac_stop_device(dev);
-	while (pd->tx_skb_i != pd->tx_skb_o)
-		dev_kfree_skb(pd->tx_skb[(pd->tx_skb_o++) % S6_NUM_TX_SKB]);
-	while (pd->rx_skb_i != pd->rx_skb_o)
-		dev_kfree_skb(pd->rx_skb[(pd->rx_skb_o++) % S6_NUM_RX_SKB]);
-	spin_unlock_irqrestore(&pd->lock, flags);
-	return 0;
-}
-
-static struct net_device_stats *s6gmac_stats(struct net_device *dev)
-{
-	struct s6gmac *pd = netdev_priv(dev);
-	struct net_device_stats *st = (struct net_device_stats *)&pd->stats;
-	int i;
-	do {
-		unsigned long flags;
-		spin_lock_irqsave(&pd->lock, flags);
-		for (i = 0; i < ARRAY_SIZE(pd->stats); i++)
-			pd->stats[i] =
-				pd->carry[i] << (S6_GMAC_STAT_SIZE_MIN - 1);
-		s6gmac_stats_collect(pd, &statinf[0][0]);
-		s6gmac_stats_collect(pd, &statinf[1][0]);
-		i = s6gmac_stats_pending(pd, 0) |
-			s6gmac_stats_pending(pd, 1);
-		spin_unlock_irqrestore(&pd->lock, flags);
-	} while (i);
-	st->rx_errors = st->rx_crc_errors +
-			st->rx_frame_errors +
-			st->rx_length_errors +
-			st->rx_missed_errors;
-	st->tx_errors += st->tx_aborted_errors;
-	return st;
-}
-
-static int s6gmac_probe(struct platform_device *pdev)
-{
-	struct net_device *dev;
-	struct s6gmac *pd;
-	int res;
-	unsigned long i;
-	struct mii_bus *mb;
-
-	dev = alloc_etherdev(sizeof(*pd));
-	if (!dev)
-		return -ENOMEM;
-
-	dev->open = s6gmac_open;
-	dev->stop = s6gmac_stop;
-	dev->hard_start_xmit = s6gmac_tx;
-	dev->tx_timeout = s6gmac_tx_timeout;
-	dev->watchdog_timeo = HZ;
-	dev->get_stats = s6gmac_stats;
-	dev->irq = platform_get_irq(pdev, 0);
-	pd = netdev_priv(dev);
-	memset(pd, 0, sizeof(*pd));
-	spin_lock_init(&pd->lock);
-	pd->reg = platform_get_resource(pdev, IORESOURCE_MEM, 0)->start;
-	i = platform_get_resource(pdev, IORESOURCE_DMA, 0)->start;
-	pd->tx_dma = DMA_MASK_DMAC(i);
-	pd->tx_chan = DMA_INDEX_CHNL(i);
-	i = platform_get_resource(pdev, IORESOURCE_DMA, 1)->start;
-	pd->rx_dma = DMA_MASK_DMAC(i);
-	pd->rx_chan = DMA_INDEX_CHNL(i);
-	pd->io = platform_get_resource(pdev, IORESOURCE_IO, 0)->start;
-	res = request_irq(dev->irq, s6gmac_interrupt, 0, dev->name, dev);
-	if (res) {
-		printk(KERN_ERR DRV_PRMT "irq request failed: %d\n", dev->irq);
-		goto errirq;
-	}
-	res = register_netdev(dev);
-	if (res) {
-		printk(KERN_ERR DRV_PRMT "error registering device %s\n",
-			dev->name);
-		goto errdev;
-	}
-	mb = mdiobus_alloc();
-	if (!mb) {
-		printk(KERN_ERR DRV_PRMT "error allocating mii bus\n");
-		res = -ENOMEM;
-		goto errmii;
-	}
-	mb->name = "s6gmac_mii";
-	mb->read = s6mii_read;
-	mb->write = s6mii_write;
-	mb->reset = s6mii_reset;
-	mb->priv = pd;
-	snprintf(mb->id, MII_BUS_ID_SIZE, "%s-%x", pdev->name, pdev->id);
-	mb->phy_mask = ~(1 << 0);
-	mb->irq = &pd->mii.irq[0];
-	for (i = 0; i < PHY_MAX_ADDR; i++) {
-		int n = platform_get_irq(pdev, i + 1);
-		if (n < 0)
-			n = PHY_POLL;
-		pd->mii.irq[i] = n;
-	}
-	mdiobus_register(mb);
-	pd->mii.bus = mb;
-	res = s6gmac_phy_start(dev);
-	if (res)
-		return res;
-	platform_set_drvdata(pdev, dev);
-	return 0;
-errmii:
-	unregister_netdev(dev);
-errdev:
-	free_irq(dev->irq, dev);
-errirq:
-	free_netdev(dev);
-	return res;
-}
-
-static int s6gmac_remove(struct platform_device *pdev)
-{
-	struct net_device *dev = platform_get_drvdata(pdev);
-	if (dev) {
-		struct s6gmac *pd = netdev_priv(dev);
-		mdiobus_unregister(pd->mii.bus);
-		unregister_netdev(dev);
-		free_irq(dev->irq, dev);
-		free_netdev(dev);
-	}
-	return 0;
-}
-
-static struct platform_driver s6gmac_driver = {
-	.probe = s6gmac_probe,
-	.remove = s6gmac_remove,
-	.driver = {
-		.name = "s6gmac",
-	},
-};
-
-module_platform_driver(s6gmac_driver);
-
-MODULE_LICENSE("GPL");
-MODULE_DESCRIPTION("S6105 on chip Ethernet driver");
-MODULE_AUTHOR("Oskar Schirmer <oskar@scara.com>");
-- 
2.1.2

^ permalink raw reply related

* Re: SRIOV as bridge Re: [PATCH net-next RESEND] net: Do not call ndo_dflt_fdb_dump if ndo_fdb_dump is defined.
From: Roopa Prabhu @ 2014-12-21 19:36 UTC (permalink / raw)
  To: Jamal Hadi Salim
  Cc: John Fastabend, Hubert Sokolowski, netdev@vger.kernel.org,
	Vlad Yasevich, Shrijeet Mukherjee
In-Reply-To: <54971D2F.1090802@mojatatu.com>

On 12/21/14, 11:19 AM, Jamal Hadi Salim wrote:
> On 12/21/14 14:08, Roopa Prabhu wrote:
>> PF still stays but not as the management interface.
>>
>> Even if 'TheClassThingy' where there, you wouldn't refer to it as the
>> master (ie the PF will not have a netdev master/slave relationship with
>> the VF). 'master' will still be used for the netdev 'upper' device if
>> VF was enslaved to one (which could be a bridge).
>>
>
> Well, there is an embedded switch underneath the VFs (in hardware).
> You cant send pkts from one VF to another without going through this
> switch (or in VEPA mode via it). i.e you dont need a kernel bridge.
understood. And since you don't need the kernel bridge, you don't need 
the kernel netdev master construct here.
>
> So in essence the VF is a bridge port to  this embedded switch (as
> is the PF). So the role of master points downwards from the kernel.
> Master is just not visible at the kernel. 
exactly. So you will not be able to use the kernel 'master' in this case.
> I am not sure what "self"
> would mean in this case.
'self' would just mean the driver owns the PF embedded bridge and the 
kernel bridge driver has no role in this. 'self' will just tell the VF 
driver to deal with the fdb mac entry. And the VF driver can push the 
fdb to the PF  (John can confirm if the intel sriov devices really do it 
this way or some other way).

> This is why i dont think current switchdev approach would work.
Current switchdev code in the kernel supports a driver managing its own 
switch with 'self' calls. ie bypassing the kernel bridge driver. Since 
sriov devices already did it that way...., they will just continue to 
work. They are not broken.
But, yes, for sriov devices to use the switchdev model...may need some work.

Good thing you brought up these points. A BOF to close on these things 
at netdev will be good.

Thanks,
Roopa

^ permalink raw reply

* Re: SRIOV as bridge Re: [PATCH net-next RESEND] net: Do not call ndo_dflt_fdb_dump if ndo_fdb_dump is defined.
From: John Fastabend @ 2014-12-21 19:52 UTC (permalink / raw)
  To: Roopa Prabhu
  Cc: Jamal Hadi Salim, Hubert Sokolowski, netdev@vger.kernel.org,
	Vlad Yasevich, Shrijeet Mukherjee
In-Reply-To: <54971A93.6050700@cumulusnetworks.com>

On 12/21/2014 11:08 AM, Roopa Prabhu wrote:
> On 12/21/14, 6:27 AM, Jamal Hadi Salim wrote:
>>
>> Sorry for the latency, Ive been down with a bad flu (its bad when i cant
>> type on my keyboard sitting infront of me;->),  recovering and the
>> thread seems to have caught on - should be able to catchup in the
>> next few days.
>> I am beginning to reach a conclusion that the current switchdev approach
>> is *not* going to work for SRIOV. I also worry it may be too late
>> to change that.
>> Shrijeet wanted to set up a BOF for netdev to have hopefully final
>> consensus. Shrijeet, are you going to make an official request for the
>> BOF?
>>
>> Sorry John, I dont have enough energy to address all your points but i
>> will try to just focus on SRIOV and will save a few bytes while at it.
>>

not a problem thanks for the response. I might try to document this
somewhere if folks think it would be useful. Something describing
how it works today would be helpful is my thought. Showing the
various stacked cases and how messages get propagated. (Some cases
being with bridge, without bridge, with bridge and multiple uplinks,
with bridge + VLAN filtering, macvlan, SR-IOV + bridge + VMDQ, etc.)

Its not a small task so likely won't get to it until after the new
year.

>>
>> On 12/16/14 11:35, John Fastabend wrote:
>>
>>> But in the SR-IOV case you have multiple "Cpu ports" and you want
>>> to send packets to each of them depending on the configuration.
>>>
>>>
>>>     port0   port1     port2  port3
>>>      |        |        |      |      uplinks
>>>   +------------------------------+
>>>   |                              |
>>>   |       SRIOV edge relay       |
>>>   |                              |
>>>   +------------------------------+
>>>                   |                   downlink
>>>
>>>
>>
>> Two points above:
>> 1) Did you flip uplink vs down link above?
>> (I Thought URP was the wire link)

yes sorry typo hopefully not too confusing.

>> 2) What you are not showing above which is *very important* is that
>> infact there is an underlying embedded fdb.

Yes. There is an embedded FDB.

>>
>> point #2 brings out a lot of the weird things in some of the bridge
>> code. IOW, you have an *offloaded* bridge with _bridge ports_
>> visible in the kernel but not the bridge that is controlled
>> by standard Linux bridge tools. I am not saying that the model is
>> wrong; on the contrary what Ben had exposed may fall under the
>> same category i.e you have E_BRIDGE flag on the netdev to say it sits
>> on top of an offloaded bridge and you dont need a br0 to run
>> bridge command on. But then we need some proxy (TheClassThingy) to act
>> as intemediary to the offloaded hardware.
>> If you do that then the vf becomes simply a bridge port - which
>> means bridge port ops apply.
>>
>> SRIOV it seems to have morphed its own toolkit.

Yes, but I don't think its too late to bring it into the picture here.

>> The PF port, when acting as the control interface, is actually
>> TheClassThingy we discuss on/off.

Yep or if you take Jiri's approach any port on the nic could be used
to manage this.

>> To add an fdb entry to point to vf 1, where TheClassThingy is eth1:
>> ip link set eth1 vf 1 mac aa:bb:cc:dd:ee:ff vlan 10
>>
>> IMO, SRIOV should expose these ports with names and ifindices
>> (probably does already) and pre-populated master or something
>> which points to its parent, then i can do the following:
>> bridge fdb add aa:bb:cc:dd:ee:ff vlan 10 dev vf1 master
> I had a slightly different understanding of how this would work for
> SRIOV. So, am attempting to respond to your questions for John..., ...so
> that he can correct my understanding too ..if needed :).
>
> I think SRIOV VF's do have netdevs (John can confirm, I maybe wrong). To
> me if SRIOV has a single fdb for all VF's under a PF,
> and it wants to bypass the bridge driver, there is still no reason to
> refer to the PF as a master.
> You can use self and go to the vf driver directly and it will do the
> right thing.

The VF's may have netdev's if they are in the host. In this case you
could use 'bridge fdb' to manage them. In many use cases though the
VFs are directly assigned to VMs and then are outside the hosts
management domain. For this case you can either let the host tell the
driver which addresses it would want to receive.

Another _idea_ would be to create a "shadow" netdev in the host
to manage the port even when the VF is direct assigned. Then you
could use all your normal commands from the host to set the MTU,
set any MACs, etc. At the moment as Jamal noted we have a subset
of 'ip link' commands that we use to work on VFs when they are not
in the host domain.

	'ip link set ethx vf # ...'

In the SR-IOV case you would have a PF and then a set of eth-vf#
netdev's which are not attached to a VF but act as the management
interface for the port.

>
> bridge fdb add aa:bb:cc:dd:ee:ff vlan 10 dev vf1 self
>
>>
>> master in such a case will go to TheClassThingy which would pass
>> such control to the underlying hardware.
>> The PF still stays but not as the management interface.
>

I think this is not specific to SR-IOV though right. This is the
same point for both "real" switch ASICs and SR-IOV. Using the netdev
directly as a management interace (a la rocker) seems to work OK.
But does it become cleaner to have the switch object represented
explicitly for management.

> Even if 'TheClassThingy' where there, you wouldn't refer to it as the
> master (ie the PF will not have a netdev master/slave relationship with
> the VF). 'master' will still be used for the netdev 'upper'  device if
> VF was enslaved to one (which could be a bridge).
>

Sounds right to me.

>
> Thanks,
> Roopa
>
>


-- 
John Fastabend         Intel Corporation

^ permalink raw reply

* Re: [PATCH net] net/mlx4_en: Doorbell is byteswapped in Little Endian archs
From: Sergei Shtylyov @ 2014-12-21 19:53 UTC (permalink / raw)
  To: Amir Vadai, David S. Miller
  Cc: netdev, Or Gerlitz, Yevgeny Petrilin, Wei Yang, David Laight
In-Reply-To: <1419185938-28987-1-git-send-email-amirv@mellanox.com>

Hello.

On 12/21/2014 9:18 PM, Amir Vadai wrote:

> iowrite32() will byteswap it's argument on big endian archs.
> iowrite32be() will byteswap on little endian archs.
> Since we don't want to do this unnecessary byteswap on the fast path,
> doorbell is stored in the NIC's native endianness. Using the right
> iowrite() according to the arch endianness.

> CC: Wei Yang <weiyang@linux.vnet.ibm.com>
> CC: David Laight <david.laight@aculab.com>
> Fixes: 6a4e812 ("net/mlx4_en: Avoid calling bswap in tx fast path")
> Signed-off-by: Amir Vadai <amirv@mellanox.com>
> ---
>   drivers/net/ethernet/mellanox/mlx4/en_tx.c | 11 ++++++++++-
>   1 file changed, 10 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/ethernet/mellanox/mlx4/en_tx.c b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
> index a308d41..6477cc7 100644
> --- a/drivers/net/ethernet/mellanox/mlx4/en_tx.c
> +++ b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
> @@ -962,7 +962,16 @@ netdev_tx_t mlx4_en_xmit(struct sk_buff *skb, struct net_device *dev)
>   		tx_desc->ctrl.owner_opcode = op_own;
>   		if (send_doorbell) {
>   			wmb();
> -			iowrite32(ring->doorbell_qpn,
> +		/* Since there is no iowrite*_native() that writes the value
> +		 * as is, without byteswapping - using the one the doesn't do
> +		 * byteswapping in the relevant arch endianness.
> +		 */

    Why the comment is not aligned with the code?

> +#if defined(__LITTLE_ENDIAN)
> +			iowrite32(
> +#else
> +			iowrite32be(
> +#endif

    Ugh...

> +				  ring->doorbell_qpn,
>   				  ring->bf.uar->map + MLX4_SEND_DOORBELL);
[...]

WBR, Sergei

^ permalink raw reply

* Re: SRIOV as bridge Re: [PATCH net-next RESEND] net: Do not call ndo_dflt_fdb_dump if ndo_fdb_dump is defined.
From: Jamal Hadi Salim @ 2014-12-21 20:06 UTC (permalink / raw)
  To: Roopa Prabhu
  Cc: John Fastabend, Hubert Sokolowski, netdev@vger.kernel.org,
	Vlad Yasevich, Shrijeet Mukherjee
In-Reply-To: <54972140.30704@cumulusnetworks.com>

On 12/21/14 14:36, Roopa Prabhu wrote:
> On 12/21/14, 11:19 AM, Jamal Hadi Salim wrote:
>> On 12/21/14 14:08, Roopa Prabhu wrote:

> 'self' would just mean the driver owns the PF embedded bridge and the
> kernel bridge driver has no role in this. 'self' will just tell the VF
> driver to deal with the fdb mac entry. And the VF driver can push the
> fdb to the PF  (John can confirm if the intel sriov devices really do it
> this way or some other way).
>

If the VFs are exposed as netdevs and the master (or Parent which i 
think the "P" stands for) they point to
is the PF - then is the PF a bridge abstraction?
And if the path is via is the PF - i think that seems like "master"
not self, no?

cheers,
jamal

^ permalink raw reply

* Re: SRIOV as bridge Re: [PATCH net-next RESEND] net: Do not call ndo_dflt_fdb_dump if ndo_fdb_dump is defined.
From: Roopa Prabhu @ 2014-12-21 20:46 UTC (permalink / raw)
  To: Jamal Hadi Salim
  Cc: John Fastabend, Hubert Sokolowski, netdev@vger.kernel.org,
	Vlad Yasevich, Shrijeet Mukherjee
In-Reply-To: <54972859.7050108@mojatatu.com>

On 12/21/14, 12:06 PM, Jamal Hadi Salim wrote:
> On 12/21/14 14:36, Roopa Prabhu wrote:
>> On 12/21/14, 11:19 AM, Jamal Hadi Salim wrote:
>>> On 12/21/14 14:08, Roopa Prabhu wrote:
>
>> 'self' would just mean the driver owns the PF embedded bridge and the
>> kernel bridge driver has no role in this. 'self' will just tell the VF
>> driver to deal with the fdb mac entry. And the VF driver can push the
>> fdb to the PF  (John can confirm if the intel sriov devices really do it
>> this way or some other way).
>>
>
> If the VFs are exposed as netdevs and the master (or Parent which i 
> think the "P" stands for) they point to
> is the PF - then is the PF a bridge abstraction?
yes, could be, but its not today ('PF' is physical function and 'VF' is 
virtual function).
If you introduce a master/slave relationship between the PF and VF (ie 
VF's were assigned PF as the master using 'ip link set dev vf master 
PF), then yes.
> And if the path is via is the PF - i think that seems like "master"
> not self, no?

Today ...when you add fdb...path is not via the PF netdev. It maybe 
internally done that way in PF/VF driver.
so, 'master' does not apply today. But if there were such a relationship 
between PF/VF, yes, 'master' could be used.

PF does not really need to have a master relationship with the VF. Its 
better that way. Infact it should be that way even in the case of 'the 
switch device class model' because that will allow switch ports to be 
added to a linux bridge (and hence make use of the linux bridge (cumulus 
model). 'master' will be the 'linux bridge device' in this case).

Thanks,
Roopa

^ permalink raw reply

* [PATCH 00/28] remove .owner for most platform_drivers: the missing bits
From: Wolfram Sang @ 2014-12-21 21:14 UTC (permalink / raw)
  To: linux-kernel
  Cc: Wolfram Sang, alsa-devel, ath5k-devel, Dan Williams, devicetree,
	dmaengine, dri-devel, iommu, linux-arm-kernel, linux-gpio,
	linux-mips, linux-omap, linux-pci, linux-pm, linuxppc-dev,
	linux-rockchip, linux-samsung-soc, linux-scsi, linux-serial,
	linux-usb, linux-watchdog, linux-wireless, netdev,
	openipmi-developer, rtc-linux

Generated with coccinelle. The big cleanup was pulled in this merge window.
This series catches the bits fallen through. The patches shall go in via the
subsystem trees. If possible for 3.19 to increase consistency I'd say, but you
decide, of course.

cocci-file used:

@match1@
declarer name module_platform_driver;
declarer name module_platform_driver_probe;
declarer name for_each_node_by_type;
identifier __driver;
@@
(
	module_platform_driver(__driver);
|
	module_platform_driver_probe(__driver, ...);
)

@fix1 depends on match1@
identifier match1.__driver;
@@
	static struct platform_driver __driver = {
		.driver = {
-			.owner = THIS_MODULE,
		}
	};

@match2@
identifier __driver;
@@
(
	platform_driver_register(&__driver)
|
	platform_driver_probe(&__driver, ...)
|
	platform_create_bundle(&__driver, ...)
)

@fix2 depends on match2@
identifier match2.__driver;
@@
	static struct platform_driver __driver = {
		.driver = {
-			.owner = THIS_MODULE,
		}
	};

Thanks again to Julia Lawall for support. And hey, we fixed a coccinelle bug on
the way :)


Wolfram Sang (28):
  ARM: mach-exynos: drop owner assignment from platform_drivers
  mips: lantiq: xway: drop owner assignment from platform_drivers
  mips: pci: drop owner assignment from platform_drivers
  char: ipmi: drop owner assignment from platform_drivers
  cpufreq: drop owner assignment from platform_drivers
  dma: drop owner assignment from platform_drivers
  gpio: drop owner assignment from platform_drivers
  gpu: drm: rockchip: drop owner assignment from platform_drivers
  iommu: drop owner assignment from platform_drivers
  net: ethernet: stmicro: stmmac: drop owner assignment from
    platform_drivers
  net: wireless: ath: ath5k: drop owner assignment from platform_drivers
  of: drop owner assignment from platform_drivers
  pci: host: drop owner assignment from platform_drivers
  phy: drop owner assignment from platform_drivers
  pinctrl: intel: drop owner assignment from platform_drivers
  rtc: drop owner assignment from platform_drivers
  scsi: drop owner assignment from platform_drivers
  thermal: drop owner assignment from platform_drivers
  thermal: int340x_thermal: drop owner assignment from platform_drivers
  tty: serial: 8250: drop owner assignment from platform_drivers
  usb: gadget: udc: bdc: drop owner assignment from platform_drivers
  watchdog: drop owner assignment from platform_drivers
  ASoC: intel: drop owner assignment from platform_drivers
  ASoC: intel: sst: drop owner assignment from platform_drivers
  ASoC: omap: drop owner assignment from platform_drivers
  ASoC: pxa: drop owner assignment from platform_drivers
  ASoC: samsung: drop owner assignment from platform_drivers
  macintosh: drop owner assignment from platform_drivers

 arch/arm/mach-exynos/pmu.c                            | 1 -
 arch/mips/lantiq/xway/vmmc.c                          | 1 -
 arch/mips/pci/pci-ar2315.c                            | 1 -
 arch/mips/pci/pci-rt2880.c                            | 1 -
 drivers/char/ipmi/ipmi_powernv.c                      | 1 -
 drivers/cpufreq/ls1x-cpufreq.c                        | 1 -
 drivers/dma/at_xdmac.c                                | 1 -
 drivers/gpio/gpio-vf610.c                             | 1 -
 drivers/gpu/drm/rockchip/rockchip_drm_drv.c           | 1 -
 drivers/iommu/rockchip-iommu.c                        | 1 -
 drivers/macintosh/windfarm_pm112.c                    | 1 -
 drivers/macintosh/windfarm_pm72.c                     | 1 -
 drivers/macintosh/windfarm_rm31.c                     | 1 -
 drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c | 1 -
 drivers/net/wireless/ath/ath5k/ahb.c                  | 1 -
 drivers/of/unittest.c                                 | 1 -
 drivers/pci/host/pci-layerscape.c                     | 1 -
 drivers/phy/phy-armada375-usb2.c                      | 1 -
 drivers/phy/phy-berlin-usb.c                          | 1 -
 drivers/phy/phy-miphy28lp.c                           | 1 -
 drivers/pinctrl/intel/pinctrl-cherryview.c            | 1 -
 drivers/rtc/rtc-opal.c                                | 1 -
 drivers/scsi/atari_scsi.c                             | 1 -
 drivers/scsi/mac_scsi.c                               | 1 -
 drivers/scsi/sun3_scsi.c                              | 1 -
 drivers/thermal/int340x_thermal/int3400_thermal.c     | 1 -
 drivers/thermal/int340x_thermal/int3402_thermal.c     | 1 -
 drivers/thermal/rockchip_thermal.c                    | 1 -
 drivers/tty/serial/8250/8250_omap.c                   | 1 -
 drivers/usb/gadget/udc/bdc/bdc_core.c                 | 1 -
 drivers/watchdog/cadence_wdt.c                        | 1 -
 drivers/watchdog/meson_wdt.c                          | 1 -
 sound/soc/intel/bytcr_dpcm_rt5640.c                   | 1 -
 sound/soc/intel/cht_bsw_rt5672.c                      | 1 -
 sound/soc/intel/sst/sst_acpi.c                        | 1 -
 sound/soc/omap/omap-hdmi-audio.c                      | 1 -
 sound/soc/pxa/spitz.c                                 | 1 -
 sound/soc/samsung/arndale_rt5631.c                    | 1 -
 38 files changed, 38 deletions(-)

-- 
2.1.3

^ permalink raw reply

* [PATCH 10/28] net: ethernet: stmicro: stmmac: drop owner assignment from platform_drivers
From: Wolfram Sang @ 2014-12-21 21:14 UTC (permalink / raw)
  To: linux-kernel; +Cc: Wolfram Sang, Giuseppe Cavallaro, netdev
In-Reply-To: <1419196495-9626-1-git-send-email-wsa@the-dreams.de>

This platform_driver does not need to set an owner, it will be populated by the
driver core.

Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
---
Generated with coccinelle. SmPL file is in the introductory msg. The big
cleanup was pulled in this merge window. This series catches the bits fallen
through. The patches shall go in via the subsystem trees.

 drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
index 4032b170fe24..3039de2465ba 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
@@ -430,7 +430,6 @@ static struct platform_driver stmmac_pltfr_driver = {
 	.remove = stmmac_pltfr_remove,
 	.driver = {
 		   .name = STMMAC_RESOURCE_NAME,
-		   .owner = THIS_MODULE,
 		   .pm = &stmmac_pltfr_pm_ops,
 		   .of_match_table = of_match_ptr(stmmac_dt_ids),
 	},
-- 
2.1.3

^ permalink raw reply related

* [PATCH 11/28] net: wireless: ath: ath5k: drop owner assignment from platform_drivers
From: Wolfram Sang @ 2014-12-21 21:14 UTC (permalink / raw)
  To: linux-kernel
  Cc: Wolfram Sang, Jiri Slaby, Nick Kossifidis, Luis R. Rodriguez,
	Kalle Valo, linux-wireless, ath5k-devel, netdev
In-Reply-To: <1419196495-9626-1-git-send-email-wsa@the-dreams.de>

This platform_driver does not need to set an owner, it will be populated by the
driver core.

Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
---
Generated with coccinelle. SmPL file is in the introductory msg. The big
cleanup was pulled in this merge window. This series catches the bits fallen
through. The patches shall go in via the subsystem trees.

 drivers/net/wireless/ath/ath5k/ahb.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/net/wireless/ath/ath5k/ahb.c b/drivers/net/wireless/ath/ath5k/ahb.c
index 8f387cf67340..2ca88b593e4c 100644
--- a/drivers/net/wireless/ath/ath5k/ahb.c
+++ b/drivers/net/wireless/ath/ath5k/ahb.c
@@ -227,7 +227,6 @@ static struct platform_driver ath_ahb_driver = {
 	.remove     = ath_ahb_remove,
 	.driver		= {
 		.name	= "ar231x-wmac",
-		.owner	= THIS_MODULE,
 	},
 };
 
-- 
2.1.3

^ permalink raw reply related

* Re: [BUG] rtl8192se: panic accessing unmapped memory in skb
From: Larry Finger @ 2014-12-21 23:02 UTC (permalink / raw)
  To: Eric Biggers, linux-wireless; +Cc: netdev, linux-kernel
In-Reply-To: <20141221172516.GA12784@zzz>

[-- Attachment #1: Type: text/plain, Size: 2339 bytes --]

On 12/21/2014 11:25 AM, Eric Biggers wrote:
> Hi,
>
> I have a RTL8192SE wireless card, attached via PCI.  Usually it works with no
> issues, but I recently had a kernel panic occur in the rtl8192se driver.  The
> kernel version is 3.18.  Based on my analysis of the panic dump, the panic was
> caused by a memory access violation in this block of code in
> rtl92se_rx_query_desc():
>
>          if (stats->decrypted) {
>                  hdr = (struct ieee80211_hdr *)(skb->data +
>                         stats->rx_drvinfo_size + stats->rx_bufshift);
>
>                  if ((_ieee80211_is_robust_mgmt_frame(hdr)) &&
>                          (ieee80211_has_protected(hdr->frame_control)))
>                          rx_status->flag &= ~RX_FLAG_DECRYPTED;
>                  else
>                          rx_status->flag |= RX_FLAG_DECRYPTED;
>          }
>
> Specifically, the violation occurred the first time hdr->frame_control was
> accessed, as part of _ieee80211_is_robust_mgmt_frame().
>
> The panic occurred when the system was under heavy filesystem load but seemingly
> is not easily reproducible.
>
> There was recently a NULL check that was removed from this exact place in the
> code, but it was certainly useless.  Instead, what's much more suspect to me is
> that inside _rtl_pci_rx_interrupt(), there is no error checking of the return
> value of _rtl_pci_init_one_rxdesc(), which might fail if the skb couldn't be
> allocated.  I am wondering if this could be causing the problem.

Your analysis is probably correct; however, I'm not sure what to do if the 
allocate of an skb fails. As the name says, this routine is entered through an 
interrupt, and I'm not sure what to do other than to exit.

The attached patch will implement the exit after logging an error. Please patch 
your system and report back.

How much RAM does your system have? That info might be useful in trying to 
reproduce the problem, which might indeed be difficult. Although pci.c was 
extensively reworked in the 3.17 => 3.18 transition, most of the changes were 
added to implement the changed descriptor structure for the RTL8192EE, and I do 
not remember any changes that would affect any of the other drivers. As a 
result, the current structure has been in place for some time, and this problem 
has not been reported before.

Larry


[-- Attachment #2: rtlwifi_report_add_new_rxdesc_failure --]
[-- Type: text/plain, Size: 2180 bytes --]

diff --git a/drivers/net/wireless/rtlwifi/pci.c b/drivers/net/wireless/rtlwifi/pci.c
index 846a2e6..c576c71 100644
--- a/drivers/net/wireless/rtlwifi/pci.c
+++ b/drivers/net/wireless/rtlwifi/pci.c
@@ -676,7 +676,7 @@ static int _rtl_pci_init_one_rxdesc(struct ieee80211_hw *hw,
 
 	skb = dev_alloc_skb(rtlpci->rxbuffersize);
 	if (!skb)
-		return 0;
+		return -ENOMEM;;
 	rtlpci->rx_ring[rxring_idx].rx_buf[desc_idx] = skb;
 
 	/* just set skb->cb to mapping addr for pci_unmap_single use */
@@ -685,7 +685,7 @@ static int _rtl_pci_init_one_rxdesc(struct ieee80211_hw *hw,
 			       rtlpci->rxbuffersize, PCI_DMA_FROMDEVICE);
 	bufferaddress = *((dma_addr_t *)skb->cb);
 	if (pci_dma_mapping_error(rtlpci->pdev, bufferaddress))
-		return 0;
+		return -EFAULT;
 	if (rtlpriv->use_new_trx_flow) {
 		rtlpriv->cfg->ops->set_desc(hw, (u8 *)entry, false,
 					    HW_DESC_RX_PREPARE,
@@ -701,7 +701,7 @@ static int _rtl_pci_init_one_rxdesc(struct ieee80211_hw *hw,
 					    HW_DESC_RXOWN,
 					    (u8 *)&tmp_one);
 	}
-	return 1;
+	return 0;
 }
 
 /* inorder to receive 8K AMSDU we have set skb to
@@ -768,6 +768,7 @@ static void _rtl_pci_rx_interrupt(struct ieee80211_hw *hw)
 		.signal = 0,
 		.rate = 0,
 	};
+	int err;
 
 	/*RX NORMAL PKT */
 	while (count--) {
@@ -912,13 +913,21 @@ static void _rtl_pci_rx_interrupt(struct ieee80211_hw *hw)
 		}
 end:
 		if (rtlpriv->use_new_trx_flow) {
-			_rtl_pci_init_one_rxdesc(hw, (u8 *)buffer_desc,
-						 rxring_idx,
-					       rtlpci->rx_ring[rxring_idx].idx);
+			err = _rtl_pci_init_one_rxdesc(hw, (u8 *)buffer_desc,
+						       rxring_idx,
+						       rtlpci->rx_ring[rxring_idx].idx);
+			if (err) {
+				pr_err("%s Failed to init RX descriptor\n");
+				return;
+			}
 		} else {
-			_rtl_pci_init_one_rxdesc(hw, (u8 *)pdesc, rxring_idx,
-						 rtlpci->rx_ring[rxring_idx].idx);
-
+			err = _rtl_pci_init_one_rxdesc(hw, (u8 *)pdesc,
+						       rxring_idx,
+						       rtlpci->rx_ring[rxring_idx].idx);
+			if (err) {
+				pr_err("%s Failed to init RX descriptor\n");
+				return;
+			}
 			if (rtlpci->rx_ring[rxring_idx].idx ==
 			    rtlpci->rxringcount - 1)
 				rtlpriv->cfg->ops->set_desc(hw, (u8 *)pdesc,


^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox