Netdev List

* Re: [PATCH] networking: fm10k: Fix build failure
From: David Miller @ 2014-10-10  5:01 UTC (permalink / raw)
  To: bobby.prani
  Cc: linux.nics, e1000-devel, bruce.w.allan, jesse.brandeburg,
	linux-kernel, john.ronciak, netdev
In-Reply-To: <1412916929-21592-1-git-send-email-bobby.prani@gmail.com>

From: Pranith Kumar <bobby.prani@gmail.com>
Date: Fri, 10 Oct 2014 00:55:29 -0400

> The latest linus git tip (3.18-rc1) fails with the following build failure. Fix
> this by making PTP support explicit for fm10k driver.
> 
> drivers/built-in.o: In function `fm10k_ptp_register':
> (.text+0x12e760): undefined reference to `ptp_clock_register'
> drivers/built-in.o: In function `fm10k_ptp_unregister':
> (.text+0x12e7dc): undefined reference to `ptp_clock_unregister'
> Makefile:930: recipe for target 'vmlinux' failed
> 
> Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>

Please follow what other drivers do, which is to use "select" on this
Kconfig symbol.

Thanks.

_______________________________________________
E1000-devel mailing list
E1000-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/e1000-devel
To learn more about Intel® Ethernet, visit http://communities.intel.com/community/wired

* Re: [PATCH net-next 0/2] sunvnet: Packet processing in non-interrupt context.
From: David Miller @ 2014-10-10  5:03 UTC (permalink / raw)
  To: Raghuram.Kothakota; +Cc: sowmini.varadhan, netdev
In-Reply-To: <960E9AF5-0319-44C7-9A1C-37EAEE52E78E@oracle.com>

From: Raghuram Kothakota <Raghuram.Kothakota@oracle.com>
Date: Thu, 9 Oct 2014 21:56:45 -0700

> Sorry, I used incorrect terminology in my email. My knowledge of LLTX
> is limited and I am still learning. I was not referring to LLTX, but to
> the implementation of the sunvnet transmit and receive paths without locks.
> To me that means only one thread of execution exists at a given time, and I
> was referring to it as single-threadedness, which limits performance on SPARC
> CMT processors today. Using methods to increase parallelism will help,
> especially when the traffic involves multiple connections, mainly from the
> point of view of using multiple vCPUs to perform the processing where possible.

Let's not speak in generalities, but rather about specific
implementations of specific things.

Linux's TX path is fully parallelized and multiqueue, so you can have
as many parallel TX threads of control executing over a specific
device as you can provide TX queues.
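
The point about one TX thread of control per TX queue can be pictured with
a minimal userspace sketch. It is not kernel code: struct tx_dev,
struct tx_queue, tx_dev_init(), tx_dev_xmit() and NUM_TX_QUEUES are
hypothetical names, and the spinlock is a plain C11 atomic_flag. The only
thing it tries to show is that the lock lives in the queue, so senders
whose flows hash to different queues never contend with each other.

```c
#include <assert.h>
#include <stdatomic.h>

#define NUM_TX_QUEUES 4	/* hypothetical queue count */

struct tx_queue {
	atomic_flag lock;	/* per-queue lock, not per-device */
	unsigned long packets;	/* packets queued so far */
};

struct tx_dev {
	struct tx_queue q[NUM_TX_QUEUES];
};

static void tx_dev_init(struct tx_dev *dev)
{
	for (int i = 0; i < NUM_TX_QUEUES; i++) {
		atomic_flag_clear(&dev->q[i].lock);
		dev->q[i].packets = 0;
	}
}

/* Queue one packet on the queue selected by the flow hash and return
 * the index of the queue that was used.
 */
static unsigned int tx_dev_xmit(struct tx_dev *dev, unsigned int flow_hash)
{
	unsigned int idx = flow_hash % NUM_TX_QUEUES;
	struct tx_queue *q = &dev->q[idx];

	while (atomic_flag_test_and_set(&q->lock))
		;	/* spin: only senders on the same queue wait here */
	q->packets++;
	atomic_flag_clear(&q->lock);

	return idx;
}
```

With a single queue every sender would serialize on the same lock, which
is the single-threaded behaviour described earlier in the thread.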

* Re: [PATCH net-next 0/2] sunvnet: Packet processing in non-interrupt context.
From: Raghuram Kothakota @ 2014-10-10  5:13 UTC (permalink / raw)
  To: David Miller; +Cc: sowmini.varadhan, netdev
In-Reply-To: <20141010.010320.1545286759850518718.davem@davemloft.net>

> 
> Linux's TX path is fully parallelized and multiqueue, so you can have
> as many parallel TX threads of control executing over a specific
> device as you can provide TX queues.

Thanks, I guess we need to explore Tx queues as well.

-Raghuram
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

* [PATCH v2] networking: fm10k: Fix build failure
From: Pranith Kumar @ 2014-10-10  5:19 UTC (permalink / raw)
  To: Jeff Kirsher, Jesse Brandeburg, Bruce Allan, Carolyn Wyborny,
	Don Skidmore, Greg Rose, Matthew Vick, John Ronciak,
	Mitch Williams, Linux NICS, open list:INTEL ETHERNET DR...,
	open list:NETWORKING DRIVERS, open list

The latest linus git tip (3.18-rc1) fails with the following build failure. Fix
this by making PTP support explicit for fm10k driver.

drivers/built-in.o: In function `fm10k_ptp_register':
(.text+0x12e760): undefined reference to `ptp_clock_register'
drivers/built-in.o: In function `fm10k_ptp_unregister':
(.text+0x12e7dc): undefined reference to `ptp_clock_unregister'
Makefile:930: recipe for target 'vmlinux' failed

Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
---
 drivers/net/ethernet/intel/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/ethernet/intel/Kconfig b/drivers/net/ethernet/intel/Kconfig
index 6a6d5ee..6919adb 100644
--- a/drivers/net/ethernet/intel/Kconfig
+++ b/drivers/net/ethernet/intel/Kconfig
@@ -304,6 +304,7 @@ config FM10K
 	tristate "Intel(R) FM10000 Ethernet Switch Host Interface Support"
 	default n
 	depends on PCI_MSI
+	select PTP_1588_CLOCK
 	---help---
 	  This driver supports Intel(R) FM10000 Ethernet Switch Host
 	  Interface.  For more information on how to identify your adapter,
-- 
1.9.1


* Re: [PATCH v2] networking: fm10k: Fix build failure
From: David Miller @ 2014-10-10  5:20 UTC (permalink / raw)
  To: bobby.prani
  Cc: linux.nics, e1000-devel, bruce.w.allan, jesse.brandeburg,
	linux-kernel, john.ronciak, netdev
In-Reply-To: <1412918346-24763-1-git-send-email-bobby.prani@gmail.com>

From: Pranith Kumar <bobby.prani@gmail.com>
Date: Fri, 10 Oct 2014 01:19:06 -0400

> The latest linus git tip (3.18-rc1) fails with the following build failure. Fix
> this by making PTP support explicit for fm10k driver.
> 
> drivers/built-in.o: In function `fm10k_ptp_register':
> (.text+0x12e760): undefined reference to `ptp_clock_register'
> drivers/built-in.o: In function `fm10k_ptp_unregister':
> (.text+0x12e7dc): undefined reference to `ptp_clock_unregister'
> Makefile:930: recipe for target 'vmlinux' failed
> 
> Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>

Applied, thanks.

* Re: [PATCH v2] networking: fm10k: Fix build failure
From: Jeff Kirsher @ 2014-10-10  5:22 UTC (permalink / raw)
  To: Pranith Kumar
  Cc: Don, open list:INTEL ETHERNET DR..., Bruce Allan,
	Jesse Brandeburg, open list, Ronciak,
	open list:NETWORKING DRIVERS, Linux NICS, John
In-Reply-To: <1412918346-24763-1-git-send-email-bobby.prani@gmail.com>


On Fri, 2014-10-10 at 01:19 -0400, Pranith Kumar wrote:
> The latest linus git tip (3.18-rc1) fails with the following build failure. Fix
> this by making PTP support explicit for fm10k driver.
> 
> drivers/built-in.o: In function `fm10k_ptp_register':
> (.text+0x12e760): undefined reference to `ptp_clock_register'
> drivers/built-in.o: In function `fm10k_ptp_unregister':
> (.text+0x12e7dc): undefined reference to `ptp_clock_unregister'
> Makefile:930: recipe for target 'vmlinux' failed
> 
> Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>

Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

Dave- go ahead and pull this in directly, no need for me to put this
through our internal process.

> ---
>  drivers/net/ethernet/intel/Kconfig | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/net/ethernet/intel/Kconfig b/drivers/net/ethernet/intel/Kconfig
> index 6a6d5ee..6919adb 100644
> --- a/drivers/net/ethernet/intel/Kconfig
> +++ b/drivers/net/ethernet/intel/Kconfig
> @@ -304,6 +304,7 @@ config FM10K
>  	tristate "Intel(R) FM10000 Ethernet Switch Host Interface Support"
>  	default n
>  	depends on PCI_MSI
> +	select PTP_1588_CLOCK
>  	---help---
>  	  This driver supports Intel(R) FM10000 Ethernet Switch Host
>  	  Interface.  For more information on how to identify your adapter,



* [PATCH net 0/5] net: fix races accessing page->_count
From: Eric Dumazet @ 2014-10-10  5:24 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Alexander Duyck, Andres Lagar-Cavilla, Greg Thelen,
	Hugh Dickins, David Rientjes, Eric Dumazet

It is illegal to use atomic_set(&page->_count, ...) even if we 'own'
the page. Other entities in the kernel need to use get_page_unless_zero()
to get a reference to the page before testing page properties, so we could
lose a refcount increment.

The only case where it is valid is when page->_count is 0; we can use
this in __netdev_alloc_frag().

Note that I have never seen crashes caused by these races; the issue was
reported by Andres Lagar-Cavilla and Hugh Dickins.
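
The lost increment can be replayed in userspace with C11 atomics. This is
a sketch under stated assumptions, not kernel code: struct fake_page,
fake_get_page_unless_zero() and replay_interleaving() are hypothetical
names, and the interleaving is replayed sequentially so that the result
is deterministic.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct page; only the refcount is modelled. */
struct fake_page {
	atomic_int count;
};

/* Userspace model of get_page_unless_zero(): take a reference only if
 * the count is currently non-zero.
 */
static bool fake_get_page_unless_zero(struct fake_page *p)
{
	int old = atomic_load(&p->count);

	while (old != 0) {
		/* on failure, old is reloaded with the current value */
		if (atomic_compare_exchange_weak(&p->count, &old, old + 1))
			return true;
	}
	return false;
}

/* Replay one interleaving: the owner holds one reference, a speculative
 * user takes a second one, then the owner adds its extra reference with
 * either a blind store (old code) or an increment (new code).
 * Three references are outstanding when this returns; the return value
 * is what the counter actually says.
 */
static int replay_interleaving(bool use_atomic_set)
{
	struct fake_page page;

	atomic_init(&page.count, 1);		/* owner's reference */
	fake_get_page_unless_zero(&page);	/* speculative user: 1 -> 2 */

	if (use_atomic_set)
		atomic_store(&page.count, 2);	/* overwrites the increment */
	else
		atomic_fetch_add(&page.count, 1);	/* 2 -> 3, nothing lost */

	return atomic_load(&page.count);
}
```

With the blind store the counter ends at 2 while three references exist,
so the page could be freed while still in use once the references are
dropped. With the increment the counter ends at 3.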

Eric Dumazet (5):
  fm10k: fix race accessing page->_count
  igb: fix race accessing page->_count
  igb: fix race accessing page->_count
  mlx4: fix race accessing page->_count
  net: fix races in page->_count manipulation

 drivers/net/ethernet/intel/fm10k/fm10k_main.c |  7 +++----
 drivers/net/ethernet/intel/igb/igb_main.c     |  7 +++----
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c |  8 +++-----
 drivers/net/ethernet/mellanox/mlx4/en_rx.c    |  6 +++---
 net/core/skbuff.c                             | 25 ++++++++++++++++++-------
 5 files changed, 30 insertions(+), 23 deletions(-)

-- 
2.1.0.rc2.206.gedb03e5

* [PATCH net 1/5] fm10k: fix race accessing page->_count
From: Eric Dumazet @ 2014-10-10  5:24 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Alexander Duyck, Andres Lagar-Cavilla, Greg Thelen,
	Hugh Dickins, David Rientjes, Eric Dumazet
In-Reply-To: <1412918694-22882-1-git-send-email-edumazet@google.com>

It is illegal to use atomic_set(&page->_count, 2) even if we 'own'
the page. Other entities in the kernel need to use get_page_unless_zero()
to get a reference to the page before testing page properties, so we could
lose a refcount increment.

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 drivers/net/ethernet/intel/fm10k/fm10k_main.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_main.c b/drivers/net/ethernet/intel/fm10k/fm10k_main.c
index 6c800a330d66..9d7118a0d67a 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_main.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_main.c
@@ -219,11 +219,10 @@ static bool fm10k_can_reuse_rx_page(struct fm10k_rx_buffer *rx_buffer,
 	/* flip page offset to other buffer */
 	rx_buffer->page_offset ^= FM10K_RX_BUFSZ;
 
-	/* since we are the only owner of the page and we need to
-	 * increment it, just set the value to 2 in order to avoid
-	 * an unnecessary locked operation
+	/* Even if we own the page, we are not allowed to use atomic_set()
+	 * This would break get_page_unless_zero() users.
 	 */
-	atomic_set(&page->_count, 2);
+	atomic_inc(&page->_count);
 #else
 	/* move offset up to the next cache line */
 	rx_buffer->page_offset += truesize;
-- 
2.1.0.rc2.206.gedb03e5

* [PATCH net 2/5] igb: fix race accessing page->_count
From: Eric Dumazet @ 2014-10-10  5:24 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Alexander Duyck, Andres Lagar-Cavilla, Greg Thelen,
	Hugh Dickins, David Rientjes, Eric Dumazet
In-Reply-To: <1412918694-22882-1-git-send-email-edumazet@google.com>

It is illegal to use atomic_set(&page->_count, 2) even if we 'own'
the page. Other entities in the kernel need to use get_page_unless_zero()
to get a reference to the page before testing page properties, so we could
lose a refcount increment.

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 drivers/net/ethernet/intel/igb/igb_main.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index ae59c0b108c5..a21b14495ebd 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -6545,11 +6545,10 @@ static bool igb_can_reuse_rx_page(struct igb_rx_buffer *rx_buffer,
 	/* flip page offset to other buffer */
 	rx_buffer->page_offset ^= IGB_RX_BUFSZ;
 
-	/* since we are the only owner of the page and we need to
-	 * increment it, just set the value to 2 in order to avoid
-	 * an unnecessary locked operation
+	/* Even if we own the page, we are not allowed to use atomic_set()
+	 * This would break get_page_unless_zero() users.
 	 */
-	atomic_set(&page->_count, 2);
+	atomic_inc(&page->_count);
 #else
 	/* move offset up to the next cache line */
 	rx_buffer->page_offset += truesize;
-- 
2.1.0.rc2.206.gedb03e5

* [PATCH net 3/5] igb: fix race accessing page->_count
From: Eric Dumazet @ 2014-10-10  5:24 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Alexander Duyck, Andres Lagar-Cavilla, Greg Thelen,
	Hugh Dickins, David Rientjes, Eric Dumazet
In-Reply-To: <1412918694-22882-1-git-send-email-edumazet@google.com>

It is illegal to use atomic_set(&page->_count, 2) even if we 'own'
the page. Other entities in the kernel need to use get_page_unless_zero()
to get a reference to the page before testing page properties, so we could
lose a refcount increment.

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index d677b5a23b58..fec5212d4337 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -1865,12 +1865,10 @@ static bool ixgbe_add_rx_frag(struct ixgbe_ring *rx_ring,
 	/* flip page offset to other buffer */
 	rx_buffer->page_offset ^= truesize;
 
-	/*
-	 * since we are the only owner of the page and we need to
-	 * increment it, just set the value to 2 in order to avoid
-	 * an unecessary locked operation
+	/* Even if we own the page, we are not allowed to use atomic_set()
+	 * This would break get_page_unless_zero() users.
 	 */
-	atomic_set(&page->_count, 2);
+	atomic_inc(&page->_count);
 #else
 	/* move offset up to the next cache line */
 	rx_buffer->page_offset += truesize;
-- 
2.1.0.rc2.206.gedb03e5

* [PATCH net 4/5] mlx4: fix race accessing page->_count
From: Eric Dumazet @ 2014-10-10  5:24 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Alexander Duyck, Andres Lagar-Cavilla, Greg Thelen,
	Hugh Dickins, David Rientjes, Eric Dumazet
In-Reply-To: <1412918694-22882-1-git-send-email-edumazet@google.com>

It is illegal to use atomic_set(&page->_count, ...) even if we 'own'
the page. Other entities in the kernel need to use get_page_unless_zero()
to get a reference to the page before testing page properties, so we could
lose a refcount increment.

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 drivers/net/ethernet/mellanox/mlx4/en_rx.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/en_rx.c b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
index a33048ee9621..01660c595f5c 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
@@ -76,10 +76,10 @@ static int mlx4_alloc_pages(struct mlx4_en_priv *priv,
 	page_alloc->dma = dma;
 	page_alloc->page_offset = frag_info->frag_align;
 	/* Not doing get_page() for each frag is a big win
-	 * on asymetric workloads.
+	 * on asymetric workloads. Note we can not use atomic_set().
 	 */
-	atomic_set(&page->_count,
-		   page_alloc->page_size / frag_info->frag_stride);
+	atomic_add(page_alloc->page_size / frag_info->frag_stride - 1,
+		   &page->_count);
 	return 0;
 }
 
-- 
2.1.0.rc2.206.gedb03e5

* [PATCH net 5/5] net: fix races in page->_count manipulation
From: Eric Dumazet @ 2014-10-10  5:24 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Alexander Duyck, Andres Lagar-Cavilla, Greg Thelen,
	Hugh Dickins, David Rientjes, Eric Dumazet
In-Reply-To: <1412918694-22882-1-git-send-email-edumazet@google.com>

It is illegal to use atomic_set(&page->_count, ...) even if we 'own'
the page. Other entities in the kernel need to use get_page_unless_zero()
to get a reference to the page before testing page properties, so we could
lose a refcount increment.

The only case where it is valid is when page->_count is 0.

Fixes: 540eb7bf0bbed ("net: Update alloc frag to reduce get/put page usage and recycle pages")
Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 net/core/skbuff.c | 25 ++++++++++++++++++-------
 1 file changed, 18 insertions(+), 7 deletions(-)

diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index a30d750647e7..829d013745ab 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -360,18 +360,29 @@ refill:
 				goto end;
 		}
 		nc->frag.size = PAGE_SIZE << order;
-recycle:
-		atomic_set(&nc->frag.page->_count, NETDEV_PAGECNT_MAX_BIAS);
+		/* Even if we own the page, we do not use atomic_set().
+		 * This would break get_page_unless_zero() users.
+		 */
+		atomic_add(NETDEV_PAGECNT_MAX_BIAS - 1,
+			   &nc->frag.page->_count);
 		nc->pagecnt_bias = NETDEV_PAGECNT_MAX_BIAS;
 		nc->frag.offset = 0;
 	}
 
 	if (nc->frag.offset + fragsz > nc->frag.size) {
-		/* avoid unnecessary locked operations if possible */
-		if ((atomic_read(&nc->frag.page->_count) == nc->pagecnt_bias) ||
-		    atomic_sub_and_test(nc->pagecnt_bias, &nc->frag.page->_count))
-			goto recycle;
-		goto refill;
+		if (atomic_read(&nc->frag.page->_count) != nc->pagecnt_bias) {
+			if (!atomic_sub_and_test(nc->pagecnt_bias,
+						 &nc->frag.page->_count))
+				goto refill;
+			/* OK, page count is 0, we can safely set it */
+			atomic_set(&nc->frag.page->_count,
+				   NETDEV_PAGECNT_MAX_BIAS);
+		} else {
+			atomic_add(NETDEV_PAGECNT_MAX_BIAS - nc->pagecnt_bias,
+				   &nc->frag.page->_count);
+		}
+		nc->pagecnt_bias = NETDEV_PAGECNT_MAX_BIAS;
+		nc->frag.offset = 0;
 	}
 
 	data = page_address(nc->frag.page) + nc->frag.offset;
-- 
2.1.0.rc2.206.gedb03e5
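
The recycle decision in the hunk above can be modelled in userspace as
follows. This is a sketch, not the kernel code: struct frag_cache,
frag_cache_try_recycle() and MAX_BIAS (standing in for
NETDEV_PAGECNT_MAX_BIAS, with an arbitrary value) are hypothetical, and
the counter is a C11 atomic_int rather than page->_count.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define MAX_BIAS 64	/* hypothetical stand-in for NETDEV_PAGECNT_MAX_BIAS */

struct frag_cache {
	atomic_int page_count;	/* models page->_count */
	int pagecnt_bias;	/* references the cache still holds */
};

/* Returns true if the current page can be reused, false if the caller
 * has to allocate a new one (the "refill" path). On reuse, both the
 * counter and the bias are re-armed to MAX_BIAS.
 */
static bool frag_cache_try_recycle(struct frag_cache *nc)
{
	if (atomic_load(&nc->page_count) != nc->pagecnt_bias) {
		/* Someone else holds references: drop ours. If the old
		 * value was not exactly our bias, the count did not
		 * reach 0 and the page is still in use.
		 */
		if (atomic_fetch_sub(&nc->page_count, nc->pagecnt_bias) !=
		    nc->pagecnt_bias)
			return false;
		/* Count is 0: nobody can take a reference through
		 * get_page_unless_zero(), so a plain store is safe.
		 */
		atomic_store(&nc->page_count, MAX_BIAS);
	} else {
		/* We hold every reference: top up with an add so that a
		 * concurrent increment cannot be overwritten.
		 */
		atomic_fetch_add(&nc->page_count,
				 MAX_BIAS - nc->pagecnt_bias);
	}
	nc->pagecnt_bias = MAX_BIAS;
	return true;
}
```

The plain store survives only in the branch where the count has just been
observed to reach 0, which matches the exception stated in the commit
message.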

* Re: [PATCH net 0/5] net: fix races accessing page->_count
From: Jeff Kirsher @ 2014-10-10  5:37 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: David S. Miller, netdev, Alexander Duyck, Andres Lagar-Cavilla,
	Greg Thelen, Hugh Dickins, David Rientjes
In-Reply-To: <1412918694-22882-1-git-send-email-edumazet@google.com>

On Thu, Oct 9, 2014 at 10:24 PM, Eric Dumazet <edumazet@google.com> wrote:
> It is illegal to use atomic_set(&page->_count, ...) even if we 'own'
> the page. Other entities in the kernel need to use get_page_unless_zero()
> to get a reference to the page before testing page properties, so we could
> lose a refcount increment.
>
> The only case where it is valid is when page->_count is 0; we can use
> this in __netdev_alloc_frag().
>
> Note that I have never seen crashes caused by these races; the issue was
> reported by Andres Lagar-Cavilla and Hugh Dickins.
>
> Eric Dumazet (5):
>   fm10k: fix race accessing page->_count
>   igb: fix race accessing page->_count
>   igb: fix race accessing page->_count

Looks like the ixgbe patch has the incorrect title, or you patch igb twice. :-)

>   mlx4: fix race accessing page->_count
>   net: fix races in page->_count manipulation
>
>  drivers/net/ethernet/intel/fm10k/fm10k_main.c |  7 +++----
>  drivers/net/ethernet/intel/igb/igb_main.c     |  7 +++----
>  drivers/net/ethernet/intel/ixgbe/ixgbe_main.c |  8 +++-----
>  drivers/net/ethernet/mellanox/mlx4/en_rx.c    |  6 +++---
>  net/core/skbuff.c                             | 25 ++++++++++++++++++-------
>  5 files changed, 30 insertions(+), 23 deletions(-)
>
> --
> 2.1.0.rc2.206.gedb03e5



-- 
Cheers,
Jeff

* Re: [PATCH net 0/5] net: fix races accessing page->_count
From: Eric Dumazet @ 2014-10-10  5:42 UTC (permalink / raw)
  To: Jeff Kirsher
  Cc: Eric Dumazet, David S. Miller, netdev, Alexander Duyck,
	Andres Lagar-Cavilla, Greg Thelen, Hugh Dickins, David Rientjes
In-Reply-To: <CAL3LdT4B3eN55ALtJFxC0ZxgkopOg+5B5JbpDajXJ=0-QOVfuA@mail.gmail.com>

On Thu, 2014-10-09 at 22:37 -0700, Jeff Kirsher wrote:

> Looks like the ixgbe patch has the incorrect title, or you patch igb twice. :-)

Yes, typo in the title, but content is OK, sorry.

* Re: [PATCH net] stmmac: correct mc_filter local variable in set_filter and set_mac_addr call
From: Giuseppe CAVALLARO @ 2014-10-10  5:47 UTC (permalink / raw)
  To: Vince Bridgers, netdev, linux-kernel; +Cc: vbridger
In-Reply-To: <1412867436-22153-1-git-send-email-vbridger@opensource.altera.com>

On 10/9/2014 5:10 PM, Vince Bridgers wrote:
> Testing revealed that the local variable mc_filter was dimensioned
> incorrectly for all possible configurations and get_mac_addr should
> have been set_mac_addr (a typo). Make sure mc_filter is dimensioned
> to 8 32-bit unsigned longs - the largest size of the Synopsys
> multicast filter register set.
>
> Signed-off-by: Vince Bridgers <vbridger@opensource.altera.com>

Acked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>

> ---
>   drivers/net/ethernet/stmicro/stmmac/dwmac1000_core.c | 4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac1000_core.c b/drivers/net/ethernet/stmicro/stmmac/dwmac1000_core.c
> index 5efe60e..0adcf73 100644
> --- a/drivers/net/ethernet/stmicro/stmmac/dwmac1000_core.c
> +++ b/drivers/net/ethernet/stmicro/stmmac/dwmac1000_core.c
> @@ -134,7 +134,7 @@ static void dwmac1000_set_filter(struct mac_device_info *hw,
>   	void __iomem *ioaddr = (void __iomem *)dev->base_addr;
>   	unsigned int value = 0;
>   	unsigned int perfect_addr_number = hw->unicast_filter_entries;
> -	u32 mc_filter[2];
> +	u32 mc_filter[8];
>   	int mcbitslog2 = hw->mcast_bits_log2;
>
>   	pr_debug("%s: # mcasts %d, # unicast %d\n", __func__,
> @@ -182,7 +182,7 @@ static void dwmac1000_set_filter(struct mac_device_info *hw,
>   		struct netdev_hw_addr *ha;
>
>   		netdev_for_each_uc_addr(ha, dev) {
> -			stmmac_get_mac_addr(ioaddr, ha->addr,
> +			stmmac_set_mac_addr(ioaddr, ha->addr,
>   					    GMAC_ADDR_HIGH(reg),
>   					    GMAC_ADDR_LOW(reg));
>   			reg++;
>
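
The sizing argument can be checked with a little arithmetic. The sketch
below assumes that the hash table holds 2^mcast_bits_log2 bits and that
each filter register holds 32 of them; the helper names and the values
6, 7 and 8 used for the log2 are illustrative assumptions, not taken from
the driver.

```c
#include <assert.h>
#include <stdint.h>

#define MC_FILTER_MAX_REGS 8	/* 256 hash bits / 32 bits per register */

/* Number of 32-bit hash registers for a table of 2^mcbitslog2 bits. */
static unsigned int mc_filter_regs(int mcbitslog2)
{
	return (1u << mcbitslog2) / 32;
}

/* Set the bit for a hash value that was already reduced to the table
 * size. With a 2-word array, any bit_nr >= 64 would write past the end.
 */
static void mc_filter_set_bit(uint32_t *mc_filter, unsigned int bit_nr)
{
	mc_filter[bit_nr >> 5] |= 1u << (bit_nr & 31);
}
```

Under these assumptions a 64-bit table needs 2 registers, a 128-bit table
4 and a 256-bit table 8, so only the smallest configuration fits in
mc_filter[2].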

* Re: [PATCH net 1/5] fm10k: fix race accessing page->_count
From: Jeff Kirsher @ 2014-10-10  5:53 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: David S. Miller, netdev, Alexander Duyck, Andres Lagar-Cavilla,
	Greg Thelen, Hugh Dickins, David Rientjes
In-Reply-To: <1412918694-22882-2-git-send-email-edumazet@google.com>

On Thu, Oct 9, 2014 at 10:24 PM, Eric Dumazet <edumazet@google.com> wrote:
> It is illegal to use atomic_set(&page->_count, 2) even if we 'own'
> the page. Other entities in the kernel need to use get_page_unless_zero()
> to get a reference to the page before testing page properties, so we could
> lose a refcount increment.
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>

Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

Since this is a part of a series, if the changes to skbuff are ok, then
the changes to the Intel drivers are ok.

> ---
>  drivers/net/ethernet/intel/fm10k/fm10k_main.c | 7 +++----
>  1 file changed, 3 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_main.c b/drivers/net/ethernet/intel/fm10k/fm10k_main.c
> index 6c800a330d66..9d7118a0d67a 100644
> --- a/drivers/net/ethernet/intel/fm10k/fm10k_main.c
> +++ b/drivers/net/ethernet/intel/fm10k/fm10k_main.c
> @@ -219,11 +219,10 @@ static bool fm10k_can_reuse_rx_page(struct fm10k_rx_buffer *rx_buffer,
>         /* flip page offset to other buffer */
>         rx_buffer->page_offset ^= FM10K_RX_BUFSZ;
>
> -       /* since we are the only owner of the page and we need to
> -        * increment it, just set the value to 2 in order to avoid
> -        * an unnecessary locked operation
> +       /* Even if we own the page, we are not allowed to use atomic_set()
> +        * This would break get_page_unless_zero() users.
>          */
> -       atomic_set(&page->_count, 2);
> +       atomic_inc(&page->_count);
>  #else
>         /* move offset up to the next cache line */
>         rx_buffer->page_offset += truesize;
> --
> 2.1.0.rc2.206.gedb03e5



-- 
Cheers,
Jeff

* Re: [PATCH net 3/5] igb: fix race accessing page->_count
From: Jeff Kirsher @ 2014-10-10  5:54 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: David S. Miller, netdev, Alexander Duyck, Andres Lagar-Cavilla,
	Greg Thelen, Hugh Dickins, David Rientjes
In-Reply-To: <1412918694-22882-4-git-send-email-edumazet@google.com>

On Thu, Oct 9, 2014 at 10:24 PM, Eric Dumazet <edumazet@google.com> wrote:
> It is illegal to use atomic_set(&page->_count, 2) even if we 'own'
> the page. Other entities in the kernel need to use get_page_unless_zero()
> to get a reference to the page before testing page properties, so we could
> lose a refcount increment.
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>

Change the title to "ixgbe: ...", then you have my ACK.
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

Since this is a part of a series, if the changes to skbuff are ok, then
the changes to the Intel drivers are ok.

> ---
>  drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 8 +++-----
>  1 file changed, 3 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> index d677b5a23b58..fec5212d4337 100644
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> @@ -1865,12 +1865,10 @@ static bool ixgbe_add_rx_frag(struct ixgbe_ring *rx_ring,
>         /* flip page offset to other buffer */
>         rx_buffer->page_offset ^= truesize;
>
> -       /*
> -        * since we are the only owner of the page and we need to
> -        * increment it, just set the value to 2 in order to avoid
> -        * an unecessary locked operation
> +       /* Even if we own the page, we are not allowed to use atomic_set()
> +        * This would break get_page_unless_zero() users.
>          */
> -       atomic_set(&page->_count, 2);
> +       atomic_inc(&page->_count);
>  #else
>         /* move offset up to the next cache line */
>         rx_buffer->page_offset += truesize;
> --
> 2.1.0.rc2.206.gedb03e5



-- 
Cheers,
Jeff

* Re: [PATCH net 2/5] igb: fix race accessing page->_count
From: Jeff Kirsher @ 2014-10-10  5:55 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: David S. Miller, netdev, Alexander Duyck, Andres Lagar-Cavilla,
	Greg Thelen, Hugh Dickins, David Rientjes
In-Reply-To: <1412918694-22882-3-git-send-email-edumazet@google.com>

On Thu, Oct 9, 2014 at 10:24 PM, Eric Dumazet <edumazet@google.com> wrote:
> It is illegal to use atomic_set(&page->_count, 2) even if we 'own'
> the page. Other entities in the kernel need to use get_page_unless_zero()
> to get a reference to the page before testing page properties, so we could
> lose a refcount increment.
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>

Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

Since this is a part of a series, if the changes to skbuff are ok, then
the changes to the Intel drivers are ok.

> ---
>  drivers/net/ethernet/intel/igb/igb_main.c | 7 +++----
>  1 file changed, 3 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
> index ae59c0b108c5..a21b14495ebd 100644
> --- a/drivers/net/ethernet/intel/igb/igb_main.c
> +++ b/drivers/net/ethernet/intel/igb/igb_main.c
> @@ -6545,11 +6545,10 @@ static bool igb_can_reuse_rx_page(struct igb_rx_buffer *rx_buffer,
>         /* flip page offset to other buffer */
>         rx_buffer->page_offset ^= IGB_RX_BUFSZ;
>
> -       /* since we are the only owner of the page and we need to
> -        * increment it, just set the value to 2 in order to avoid
> -        * an unnecessary locked operation
> +       /* Even if we own the page, we are not allowed to use atomic_set()
> +        * This would break get_page_unless_zero() users.
>          */
> -       atomic_set(&page->_count, 2);
> +       atomic_inc(&page->_count);
>  #else
>         /* move offset up to the next cache line */
>         rx_buffer->page_offset += truesize;
> --
> 2.1.0.rc2.206.gedb03e5



-- 
Cheers,
Jeff

* [PATCH][net-next][V2] net: filter: fix the comments
From: roy.qing.li @ 2014-10-10  5:56 UTC (permalink / raw)
  To: netdev; +Cc: ast, alexei.starovoitov

From: Li RongQing <roy.qing.li@gmail.com>

1. sk_run_filter has been renamed; sk_filter() now uses SK_RUN_FILTER.
2. Remove wrong comments about storing intermediate values.
3. Replace sk_run_filter with __bpf_prog_run in the comments for
check_load_and_stores.

Cc: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
---
 net/core/filter.c |    9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/net/core/filter.c b/net/core/filter.c
index fcd3f67..647b122 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -51,9 +51,9 @@
  *	@skb: buffer to filter
  *
  * Run the filter code and then cut skb->data to correct size returned by
- * sk_run_filter. If pkt_len is 0 we toss packet. If skb->len is smaller
+ * SK_RUN_FILTER. If pkt_len is 0 we toss packet. If skb->len is smaller
  * than pkt_len we keep whole skb->data. This is the socket level
- * wrapper to sk_run_filter. It returns 0 if the packet should
+ * wrapper to SK_RUN_FILTER. It returns 0 if the packet should
  * be accepted or -EPERM if the packet should be tossed.
  *
  */
@@ -566,11 +566,8 @@ err:
 
 /* Security:
  *
- * A BPF program is able to use 16 cells of memory to store intermediate
- * values (check u32 mem[BPF_MEMWORDS] in sk_run_filter()).
- *
  * As we dont want to clear mem[] array for each packet going through
- * sk_run_filter(), we check that filter loaded by user never try to read
+ * __bpf_prog_run(), we check that filter loaded by user never try to read
  * a cell if not previously written, and we check all branches to be sure
  * a malicious user doesn't try to abuse us.
  */
-- 
1.7.10.4

^ permalink raw reply related

* Re: [PATCH net 1/3] net: bcmgenet: fix off-by-one in incrementing read pointer
From: Petri Gynther @ 2014-10-10  6:01 UTC (permalink / raw)
  To: Florian Fainelli; +Cc: netdev, David Miller, jaedon.shin
In-Reply-To: <1412903197-19193-2-git-send-email-f.fainelli@gmail.com>

Hi Florian,

On Thu, Oct 9, 2014 at 6:06 PM, Florian Fainelli <f.fainelli@gmail.com> wrote:
> Commit b629be5c8399d7c423b92135eb43a86c924d1cbc ("net: bcmgenet: check
> harder for out of memory conditions") moved the increment of the local
> read pointer *before* reading from the hardware descriptor using
> dmadesc_get_length_status(), which creates an off-by-one situation.
>
> Fix this by moving again the read_ptr increment after we have read the
> hardware descriptor to get both the control block and the read pointer
> back in sync.
>
> Fixes: b629be5c8399 ("net: bcmgenet: check harder for out of memory conditions")
> Reported-by: Jaedon Shin <jaedon.shin@gmail.com>
> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
> ---
>  drivers/net/ethernet/broadcom/genet/bcmgenet.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/net/ethernet/broadcom/genet/bcmgenet.c b/drivers/net/ethernet/broadcom/genet/bcmgenet.c
> index fff2634b6f34..f1bcebcbba80 100644
> --- a/drivers/net/ethernet/broadcom/genet/bcmgenet.c
> +++ b/drivers/net/ethernet/broadcom/genet/bcmgenet.c
> @@ -1287,9 +1287,6 @@ static unsigned int bcmgenet_desc_rx(struct bcmgenet_priv *priv,
>
>                 rxpktprocessed++;
>
> -               priv->rx_read_ptr++;
> -               priv->rx_read_ptr &= (priv->num_rx_bds - 1);
> -

Wouldn't it be better to move the three lines:
rxpktprocessed++;
priv->rx_read_ptr++;
priv->rx_read_ptr &= (priv->num_rx_bds - 1);

as the last lines of the while-loop, after the CB refill?

-- Petri


>                 /* We do not have a backing SKB, so we do not have a
>                  * corresponding DMA mapping for this incoming packet since
>                  * bcmgenet_rx_refill always either has both skb and mapping or
> @@ -1332,6 +1329,9 @@ static unsigned int bcmgenet_desc_rx(struct bcmgenet_priv *priv,
>                           __func__, p_index, priv->rx_c_index,
>                           priv->rx_read_ptr, dma_length_status);
>
> +               priv->rx_read_ptr++;
> +               priv->rx_read_ptr &= (priv->num_rx_bds - 1);
> +
>                 if (unlikely(!(dma_flag & DMA_EOP) || !(dma_flag & DMA_SOP))) {
>                         netif_err(priv, rx_status, dev,
>                                   "dropping fragmented packet!\n");
> --
> 1.9.1
>

^ permalink raw reply
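
The off-by-one discussed in the bcmgenet thread above can be sketched in isolation. This is a hedged model, assuming (as the commit message suggests) that the control block is looked up at the old read pointer and the hardware descriptor is read afterwards. struct rx_step and the function names are hypothetical; only the increment-and-mask idiom comes from the patch.

```c
#include <stdint.h>

/* Which control block and which descriptor one loop iteration touches. */
struct rx_step {
	uint32_t cb_idx;	/* index used for the control block */
	uint32_t desc_idx;	/* index used for the hardware descriptor */
	uint32_t next_ptr;	/* read pointer left for the next iteration */
};

/* Advance a ring index; num_bds is assumed to be a power of two. */
static uint32_t ring_next(uint32_t idx, uint32_t num_bds)
{
	return (idx + 1) & (num_bds - 1);
}

/* Regressed ordering: the pointer moves before the descriptor is read. */
static struct rx_step step_advance_first(uint32_t read_ptr, uint32_t num_bds)
{
	struct rx_step s;

	s.cb_idx = read_ptr;
	s.next_ptr = ring_next(read_ptr, num_bds);
	s.desc_idx = s.next_ptr;	/* descriptor read uses the new index */
	return s;
}

/* Fixed ordering: both lookups use the same index, then the pointer moves. */
static struct rx_step step_read_first(uint32_t read_ptr, uint32_t num_bds)
{
	struct rx_step s;

	s.cb_idx = read_ptr;
	s.desc_idx = read_ptr;
	s.next_ptr = ring_next(read_ptr, num_bds);
	return s;
}
```

In the regressed ordering the control block and the descriptor belong to different ring slots, which is the mismatch the patch removes. Moving the increment to the very end of the loop, as suggested in the reply, would preserve the same invariant.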

* Re: [PATCH][net-next][V2] net: filter: fix the comments
From: Alexei Starovoitov @ 2014-10-10  6:01 UTC (permalink / raw)
  To: roy.qing.li, Daniel Borkmann; +Cc: Network Development, Alexei Starovoitov
In-Reply-To: <1412920611-2094-1-git-send-email-roy.qing.li@gmail.com>

On Thu, Oct 9, 2014 at 10:56 PM,  <roy.qing.li@gmail.com> wrote:
> From: Li RongQing <roy.qing.li@gmail.com>
>
> 1. sk_run_filter() has been renamed; sk_filter() now uses SK_RUN_FILTER.
> 2. Remove the incorrect comment about storing intermediate values.
> 3. Replace sk_run_filter with __bpf_prog_run in the comment above
> check_load_and_stores().
>
> Cc: Alexei Starovoitov <ast@plumgrid.com>
> Signed-off-by: Li RongQing <roy.qing.li@gmail.com>

Acked-by: Alexei Starovoitov <ast@plumgrid.com>

Thanks!

> ---
>  net/core/filter.c |    9 +++------
>  1 file changed, 3 insertions(+), 6 deletions(-)
>
> diff --git a/net/core/filter.c b/net/core/filter.c
> index fcd3f67..647b122 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -51,9 +51,9 @@
>   *     @skb: buffer to filter
>   *
>   * Run the filter code and then cut skb->data to correct size returned by
> - * sk_run_filter. If pkt_len is 0 we toss packet. If skb->len is smaller
> + * SK_RUN_FILTER. If pkt_len is 0 we toss packet. If skb->len is smaller
>   * than pkt_len we keep whole skb->data. This is the socket level
> - * wrapper to sk_run_filter. It returns 0 if the packet should
> + * wrapper to SK_RUN_FILTER. It returns 0 if the packet should
>   * be accepted or -EPERM if the packet should be tossed.
>   *
>   */
> @@ -566,11 +566,8 @@ err:
>
>  /* Security:
>   *
> - * A BPF program is able to use 16 cells of memory to store intermediate
> - * values (check u32 mem[BPF_MEMWORDS] in sk_run_filter()).
> - *
>   * As we dont want to clear mem[] array for each packet going through
> - * sk_run_filter(), we check that filter loaded by user never try to read
> + * __bpf_prog_run(), we check that filter loaded by user never try to read
>   * a cell if not previously written, and we check all branches to be sure
>   * a malicious user doesn't try to abuse us.
>   */
> --
> 1.7.10.4
>

^ permalink raw reply

* [PATCH v3 2/3] net: fec: ptp: Use hardware algorithm to adjust PTP counter.
From: Luwei Zhou @ 2014-10-10  5:15 UTC (permalink / raw)
  To: davem, richardcochran
  Cc: netdev, shawn.guo, bhutchings, R49496, b38611, b20596, stephen
In-Reply-To: <1412918130-18830-1-git-send-email-b45643@freescale.com>

The FEC IP supports hardware adjustment of the ptp timer; see the descriptions
of the ENET_ATCOR and ENET_ATINC registers in the spec. This patch uses that
hardware support to adjust the ptp offset and frequency on the slave side.

Signed-off-by: Luwei Zhou <b45643@freescale.com>
Signed-off-by: Frank Li <Frank.Li@freescale.com>
Signed-off-by: Fugang Duan <b38611@freescale.com>
---
 drivers/net/ethernet/freescale/fec.h     |  3 ++
 drivers/net/ethernet/freescale/fec_ptp.c | 65 ++++++++++++++++++++++++++------
 2 files changed, 56 insertions(+), 12 deletions(-)

diff --git a/drivers/net/ethernet/freescale/fec.h b/drivers/net/ethernet/freescale/fec.h
index 1d5e182..b0e6025 100644
--- a/drivers/net/ethernet/freescale/fec.h
+++ b/drivers/net/ethernet/freescale/fec.h
@@ -484,6 +484,9 @@ struct fec_enet_private {
 	unsigned int itr_clk_rate;
 
 	u32 rx_copybreak;
+
+	/* ptp clock period in ns*/
+	unsigned int ptp_inc;
 };
 
 void fec_ptp_init(struct platform_device *pdev);
diff --git a/drivers/net/ethernet/freescale/fec_ptp.c b/drivers/net/ethernet/freescale/fec_ptp.c
index 8016bdd..f5ee460 100644
--- a/drivers/net/ethernet/freescale/fec_ptp.c
+++ b/drivers/net/ethernet/freescale/fec_ptp.c
@@ -145,32 +145,59 @@ void fec_ptp_start_cyclecounter(struct net_device *ndev)
  */
 static int fec_ptp_adjfreq(struct ptp_clock_info *ptp, s32 ppb)
 {
-	u64 diff;
 	unsigned long flags;
 	int neg_adj = 0;
-	u32 mult = FEC_CC_MULT;
+	u32 i, tmp;
+	u32 corr_inc, corr_period;
+	u32 corr_ns;
+	u64 lhs, rhs;
 
 	struct fec_enet_private *fep =
 	    container_of(ptp, struct fec_enet_private, ptp_caps);
 
+	if (ppb == 0)
+		return 0;
+
 	if (ppb < 0) {
 		ppb = -ppb;
 		neg_adj = 1;
 	}
 
-	diff = mult;
-	diff *= ppb;
-	diff = div_u64(diff, 1000000000ULL);
+	/* In theory, corr_inc/corr_period = ppb/NSEC_PER_SEC;
+	 * Try to find the corr_inc  between 1 to fep->ptp_inc to
+	 * meet adjustment requirement.
+	 */
+	lhs = NSEC_PER_SEC;
+	rhs = (u64)ppb * (u64)fep->ptp_inc;
+	for (i = 1; i <= fep->ptp_inc; i++) {
+		if (lhs >= rhs) {
+			corr_inc = i;
+			corr_period = div_u64(lhs, rhs);
+			break;
+		}
+		lhs += NSEC_PER_SEC;
+	}
+	/* Not found? Set it to high value - double speed
+	 * correct in every clock step.
+	 */
+	if (i > fep->ptp_inc) {
+		corr_inc = fep->ptp_inc;
+		corr_period = 1;
+	}
+
+	if (neg_adj)
+		corr_ns = fep->ptp_inc - corr_inc;
+	else
+		corr_ns = fep->ptp_inc + corr_inc;
 
 	spin_lock_irqsave(&fep->tmreg_lock, flags);
-	/*
-	 * dummy read to set cycle_last in tc to now.
-	 * So use adjusted mult to calculate when next call
-	 * timercounter_read.
-	 */
-	timecounter_read(&fep->tc);
 
-	fep->cc.mult = neg_adj ? mult - diff : mult + diff;
+	tmp = readl(fep->hwp + FEC_ATIME_INC) & FEC_T_INC_MASK;
+	tmp |= corr_ns << FEC_T_INC_CORR_OFFSET;
+	writel(tmp, fep->hwp + FEC_ATIME_INC);
+	writel(corr_period, fep->hwp + FEC_ATIME_CORR);
+	/* dummy read to update the timer. */
+	timecounter_read(&fep->tc);
 
 	spin_unlock_irqrestore(&fep->tmreg_lock, flags);
 
@@ -190,12 +217,19 @@ static int fec_ptp_adjtime(struct ptp_clock_info *ptp, s64 delta)
 	    container_of(ptp, struct fec_enet_private, ptp_caps);
 	unsigned long flags;
 	u64 now;
+	u32 counter;
 
 	spin_lock_irqsave(&fep->tmreg_lock, flags);
 
 	now = timecounter_read(&fep->tc);
 	now += delta;
 
+	/* Get the timer value based on adjusted timestamp.
+	 * Update the counter with the masked value.
+	 */
+	counter = now & fep->cc.mask;
+	writel(counter, fep->hwp + FEC_ATIME);
+
 	/* reset the timecounter */
 	timecounter_init(&fep->tc, &fep->cc, now);
 
@@ -246,6 +280,7 @@ static int fec_ptp_settime(struct ptp_clock_info *ptp,
 
 	u64 ns;
 	unsigned long flags;
+	u32 counter;
 
 	mutex_lock(&fep->ptp_clk_mutex);
 	/* Check the ptp clock */
@@ -256,8 +291,13 @@ static int fec_ptp_settime(struct ptp_clock_info *ptp,
 
 	ns = ts->tv_sec * 1000000000ULL;
 	ns += ts->tv_nsec;
+	/* Get the timer value based on timestamp.
+	 * Update the counter with the masked value.
+	 */
+	counter = ns & fep->cc.mask;
 
 	spin_lock_irqsave(&fep->tmreg_lock, flags);
+	writel(counter, fep->hwp + FEC_ATIME);
 	timecounter_init(&fep->tc, &fep->cc, ns);
 	spin_unlock_irqrestore(&fep->tmreg_lock, flags);
 	mutex_unlock(&fep->ptp_clk_mutex);
@@ -396,6 +436,7 @@ void fec_ptp_init(struct platform_device *pdev)
 	fep->ptp_caps.enable = fec_ptp_enable;
 
 	fep->cycle_speed = clk_get_rate(fep->clk_ptp);
+	fep->ptp_inc = NSEC_PER_SEC / fep->cycle_speed;
 
 	spin_lock_init(&fep->tmreg_lock);
 
-- 
1.9.1

^ permalink raw reply related
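
The search for corr_inc and corr_period in fec_ptp_adjfreq() above can be tried out in user space. The sketch below mirrors the loop from the patch with plain 64-bit division in place of div_u64(). struct fec_corr and find_correction() are hypothetical names, and the zero guard is an addition that stands in for the early "ppb == 0" return in the patch.

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Correction increment (ns) applied once every corr_period clock ticks. */
struct fec_corr {
	uint32_t corr_inc;
	uint32_t corr_period;
};

/*
 * ppb:     absolute frequency offset in parts per billion, non-zero
 * ptp_inc: nominal timer increment per clock tick in ns, non-zero
 *
 * Finds the smallest corr_inc in 1..ptp_inc such that
 * corr_inc / (corr_period * ptp_inc) approximates ppb / NSEC_PER_SEC.
 */
static struct fec_corr find_correction(uint32_t ppb, uint32_t ptp_inc)
{
	/* Fallback from the patch: correct by ptp_inc on every tick. */
	struct fec_corr c = { ptp_inc, 1 };
	uint64_t lhs = NSEC_PER_SEC;
	uint64_t rhs = (uint64_t)ppb * ptp_inc;
	uint32_t i;

	if (rhs == 0) {
		/* Guard added for this sketch; the driver returns early. */
		c.corr_inc = 0;
		c.corr_period = 0;
		return c;
	}

	for (i = 1; i <= ptp_inc; i++) {
		if (lhs >= rhs) {
			c.corr_inc = i;
			c.corr_period = (uint32_t)(lhs / rhs);
			break;
		}
		lhs += NSEC_PER_SEC;
	}
	return c;
}
```

As a worked case, assuming an 8 ns increment: a 1000 ppb offset yields one extra nanosecond every 125000 ticks, and 125000 ticks of 8 ns span 1 ms, so the rate is 1 ns per 1000000 ns as requested.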

* [PATCH v3 0/3] Enable FEC pps feature
From: Luwei Zhou @ 2014-10-10  5:15 UTC (permalink / raw)
  To: davem, richardcochran
  Cc: netdev, shawn.guo, bhutchings, R49496, b38611, b20596, stephen

Changes from v2 to v3:
	- Use the default channel 0 as the PPS channel instead of the PTP_PIN_SET/GETFUNC interface.
	- Use the Linux definition of NSEC_PER_SEC.

Changes from v1 to v2:
	- Fix the potential 32-bit multiplication overflow issue.
	- Optimize the hardware adjustment code to improve efficiency, as Richard suggested.
	- Use the PTP_PIN_SET/GETFUNC interface, not the device tree, to set the PPS channel,
	  and add the PTP_PF_PPS enumeration.
	- Modify the comment style.


Luwei Zhou (3):
  net: fec: ptp: Use the 31-bit ptp timer.
  net: fec: ptp: Use hardware algorithm to adjust PTP counter.
  net: fec: ptp: Enable PPS output based on ptp clock

 drivers/net/ethernet/freescale/fec.h      |  10 ++
 drivers/net/ethernet/freescale/fec_main.c |   2 +
 drivers/net/ethernet/freescale/fec_ptp.c  | 272 ++++++++++++++++++++++++++++--
 3 files changed, 267 insertions(+), 17 deletions(-)

-- 
1.9.1

^ permalink raw reply

* [PATCH v3 3/3] net: fec: ptp: Enable PPS output based on ptp clock
From: Luwei Zhou @ 2014-10-10  5:15 UTC (permalink / raw)
  To: davem, richardcochran
  Cc: netdev, shawn.guo, bhutchings, R49496, b38611, b20596, stephen
In-Reply-To: <1412918130-18830-1-git-send-email-b45643@freescale.com>

The FEC ptp timer has four channels with a compare/trigger function, which
can be used to generate a PPS output. The "pulse output high on compare
event" mode is used, so the pulse goes high exactly on each second
boundary. The pulse width is one cycle of the ptp timer clock source.
Since the 31-bit ptp hardware timer is used, the timer wraps after a little
more than 2 seconds, so we need to reload the compare event about every
second.

Signed-off-by: Luwei Zhou <b45643@freescale.com>
---
 drivers/net/ethernet/freescale/fec.h      |   7 ++
 drivers/net/ethernet/freescale/fec_main.c |   2 +
 drivers/net/ethernet/freescale/fec_ptp.c  | 197 +++++++++++++++++++++++++++++-
 3 files changed, 205 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/freescale/fec.h b/drivers/net/ethernet/freescale/fec.h
index b0e6025..1e65917 100644
--- a/drivers/net/ethernet/freescale/fec.h
+++ b/drivers/net/ethernet/freescale/fec.h
@@ -487,12 +487,19 @@ struct fec_enet_private {
 
 	/* ptp clock period in ns*/
 	unsigned int ptp_inc;
+
+	/* pps  */
+	int pps_channel;
+	unsigned int reload_period;
+	int pps_enable;
+	unsigned int next_counter;
 };
 
 void fec_ptp_init(struct platform_device *pdev);
 void fec_ptp_start_cyclecounter(struct net_device *ndev);
 int fec_ptp_set(struct net_device *ndev, struct ifreq *ifr);
 int fec_ptp_get(struct net_device *ndev, struct ifreq *ifr);
+uint fec_ptp_check_pps_event(struct fec_enet_private *fep);
 
 /****************************************************************************/
 #endif /* FEC_H */
diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c
index 87975b5..0167601 100644
--- a/drivers/net/ethernet/freescale/fec_main.c
+++ b/drivers/net/ethernet/freescale/fec_main.c
@@ -1622,6 +1622,8 @@ fec_enet_interrupt(int irq, void *dev_id)
 		complete(&fep->mdio_done);
 	}
 
+	fec_ptp_check_pps_event(fep);
+
 	return ret;
 }
 
diff --git a/drivers/net/ethernet/freescale/fec_ptp.c b/drivers/net/ethernet/freescale/fec_ptp.c
index f5ee460..0fdcdc9 100644
--- a/drivers/net/ethernet/freescale/fec_ptp.c
+++ b/drivers/net/ethernet/freescale/fec_ptp.c
@@ -61,6 +61,24 @@
 #define FEC_T_INC_CORR_MASK             0x00007f00
 #define FEC_T_INC_CORR_OFFSET           8
 
+#define FEC_T_CTRL_PINPER		0x00000080
+#define FEC_T_TF0_MASK			0x00000001
+#define FEC_T_TF0_OFFSET		0
+#define FEC_T_TF1_MASK			0x00000002
+#define FEC_T_TF1_OFFSET		1
+#define FEC_T_TF2_MASK			0x00000004
+#define FEC_T_TF2_OFFSET		2
+#define FEC_T_TF3_MASK			0x00000008
+#define FEC_T_TF3_OFFSET		3
+#define FEC_T_TDRE_MASK			0x00000001
+#define FEC_T_TDRE_OFFSET		0
+#define FEC_T_TMODE_MASK		0x0000003C
+#define FEC_T_TMODE_OFFSET		2
+#define FEC_T_TIE_MASK			0x00000040
+#define FEC_T_TIE_OFFSET		6
+#define FEC_T_TF_MASK			0x00000080
+#define FEC_T_TF_OFFSET			7
+
 #define FEC_ATIME_CTRL		0x400
 #define FEC_ATIME		0x404
 #define FEC_ATIME_EVT_OFFSET	0x408
@@ -69,8 +87,143 @@
 #define FEC_ATIME_INC		0x414
 #define FEC_TS_TIMESTAMP	0x418
 
+#define FEC_TGSR		0x604
+#define FEC_TCSR(n)		(0x608 + (n) * 0x08)
+#define FEC_TCCR(n)		(0x60C + (n) * 0x08)
+#define MAX_TIMER_CHANNEL	3
+#define FEC_TMODE_TOGGLE	0x05
+#define FEC_HIGH_PULSE		0x0F
+
 #define FEC_CC_MULT	(1 << 31)
 #define FEC_COUNTER_PERIOD	(1 << 31)
+#define PPS_OUPUT_RELOAD_PERIOD	NSEC_PER_SEC
+#define FEC_CHANNLE_0		0
+#define DEFAULT_PPS_CHANNEL	FEC_CHANNLE_0
+
+/**
+ * fec_ptp_enable_pps
+ * @fep: the fec_enet_private structure handle
+ * @enable: enable the channel pps output
+ *
+ * This function enables the PPS output on the timer channel.
+ */
+static int fec_ptp_enable_pps(struct fec_enet_private *fep, uint enable)
+{
+	unsigned long flags;
+	u32 val, tempval;
+	int inc;
+	struct timespec ts;
+	u64 ns;
+	u32 remainder;
+	val = 0;
+
+	if (!(fep->hwts_tx_en || fep->hwts_rx_en)) {
+		dev_err(&fep->pdev->dev, "No ptp stack is running\n");
+		return -EINVAL;
+	}
+
+	if (fep->pps_enable == enable)
+		return 0;
+
+	fep->pps_channel = DEFAULT_PPS_CHANNEL;
+	fep->reload_period = PPS_OUPUT_RELOAD_PERIOD;
+	inc = fep->ptp_inc;
+
+	spin_lock_irqsave(&fep->tmreg_lock, flags);
+
+	if (enable) {
+		/* clear capture or output compare interrupt status if have.
+		 */
+		writel(FEC_T_TF_MASK, fep->hwp + FEC_TCSR(fep->pps_channel));
+
+		/* It is recommended to double check that the TMODE field in
+		 * the TCSR register is cleared before the first compare counter
+		 * is written into the TCCR register, so check it here.
+		 */
+		val = readl(fep->hwp + FEC_TCSR(fep->pps_channel));
+		do {
+			val &= ~(FEC_T_TMODE_MASK);
+			writel(val, fep->hwp + FEC_TCSR(fep->pps_channel));
+			val = readl(fep->hwp + FEC_TCSR(fep->pps_channel));
+		} while (val & FEC_T_TMODE_MASK);
+
+		/* Dummy read counter to update the counter */
+		timecounter_read(&fep->tc);
+		/* We want to find the first compare event in the next
+		 * second point. So we need to know what the ptp time
+		 * is now and how many nanoseconds is ahead to get next second.
+		 * The remaining nanosecond ahead before the next second would be
+		 * NSEC_PER_SEC - ts.tv_nsec. Add the remaining nanoseconds
+		 * to current timer would be next second.
+		 */
+		tempval = readl(fep->hwp + FEC_ATIME_CTRL);
+		tempval |= FEC_T_CTRL_CAPTURE;
+		writel(tempval, fep->hwp + FEC_ATIME_CTRL);
+
+		tempval = readl(fep->hwp + FEC_ATIME);
+		/* Convert the ptp local counter to 1588 timestamp */
+		ns = timecounter_cyc2time(&fep->tc, tempval);
+		ts.tv_sec = div_u64_rem(ns, 1000000000ULL, &remainder);
+		ts.tv_nsec = remainder;
+
+		/* The tempval is  less than 3 seconds, and  so val is less than
+		 * 4 seconds. No overflow for 32bit calculation.
+		 */
+		val = NSEC_PER_SEC - (u32)ts.tv_nsec + tempval;
+
+		/* Consider the case where the current time is very close to
+		 * the second boundary, so NSEC_PER_SEC - ts.tv_nsec is close
+		 * to zero (for example 20ns). Since the timer is still running
+		 * while we calculate the first compare event, the remaining
+		 * nanoseconds may run out before the compare counter is
+		 * calculated and written into the TCCR register. To avoid
+		 * this, we set the compare event to the second boundary after
+		 * the next one. The timer is 31 bits wide and wraps around
+		 * after a little over 2 seconds, so it is okay to target the
+		 * second boundary after the next one.
+		 */
+		val += NSEC_PER_SEC;
+
+		/* We add (2 * NSEC_PER_SEC - (u32)ts.tv_nsec) to the current
+		 * ptp counter, which may cause a 32-bit wrap. Since
+		 * (NSEC_PER_SEC - (u32)ts.tv_nsec) is less than 2 seconds,
+		 * the wrap will not cause an issue. An offset bigger than
+		 * fep->cc.mask would be an error.
+		 */
+		val &= fep->cc.mask;
+		writel(val, fep->hwp + FEC_TCCR(fep->pps_channel));
+
+		/* Calculate the second compare event timestamp */
+		fep->next_counter = (val + fep->reload_period) & fep->cc.mask;
+
+		/* Enable compare event when overflow */
+		val = readl(fep->hwp + FEC_ATIME_CTRL);
+		val |= FEC_T_CTRL_PINPER;
+		writel(val, fep->hwp + FEC_ATIME_CTRL);
+
+		/* Compare channel setting. */
+		val = readl(fep->hwp + FEC_TCSR(fep->pps_channel));
+		val |= (1 << FEC_T_TF_OFFSET | 1 << FEC_T_TIE_OFFSET);
+		val &= ~(1 << FEC_T_TDRE_OFFSET);
+		val &= ~(FEC_T_TMODE_MASK);
+		val |= (FEC_HIGH_PULSE << FEC_T_TMODE_OFFSET);
+		writel(val, fep->hwp + FEC_TCSR(fep->pps_channel));
+
+		/* Write the second compare event timestamp and calculate
+		 * the third timestamp. Refer to the TCCR register details in the spec.
+		 */
+		writel(fep->next_counter, fep->hwp + FEC_TCCR(fep->pps_channel));
+		fep->next_counter = (fep->next_counter + fep->reload_period) & fep->cc.mask;
+	} else {
+		writel(0, fep->hwp + FEC_TCSR(fep->pps_channel));
+	}
+
+	fep->pps_enable = enable;
+	spin_unlock_irqrestore(&fep->tmreg_lock, flags);
+
+	return 0;
+}
+
 /**
  * fec_ptp_read - read raw cycle counter (to be used by time counter)
  * @cc: the cyclecounter structure
@@ -314,6 +467,15 @@ static int fec_ptp_settime(struct ptp_clock_info *ptp,
 static int fec_ptp_enable(struct ptp_clock_info *ptp,
 			  struct ptp_clock_request *rq, int on)
 {
+	struct fec_enet_private *fep =
+	    container_of(ptp, struct fec_enet_private, ptp_caps);
+	int ret = 0;
+
+	if (rq->type == PTP_CLK_REQ_PPS) {
+		ret = fec_ptp_enable_pps(fep, on);
+
+		return ret;
+	}
 	return -EOPNOTSUPP;
 }
 
@@ -428,7 +590,7 @@ void fec_ptp_init(struct platform_device *pdev)
 	fep->ptp_caps.n_ext_ts = 0;
 	fep->ptp_caps.n_per_out = 0;
 	fep->ptp_caps.n_pins = 0;
-	fep->ptp_caps.pps = 0;
+	fep->ptp_caps.pps = 1;
 	fep->ptp_caps.adjfreq = fec_ptp_adjfreq;
 	fep->ptp_caps.adjtime = fec_ptp_adjtime;
 	fep->ptp_caps.gettime = fec_ptp_gettime;
@@ -452,3 +614,36 @@ void fec_ptp_init(struct platform_device *pdev)
 
 	schedule_delayed_work(&fep->time_keep, HZ);
 }
+
+/**
+ * fec_ptp_check_pps_event
+ * @fep: the fec_enet_private structure handle
+ *
+ * This function checks the pps event and reloads the timer compare counter.
+ */
+uint fec_ptp_check_pps_event(struct fec_enet_private *fep)
+{
+	u32 val;
+	u8 channel = fep->pps_channel;
+	struct ptp_clock_event event;
+
+	val = readl(fep->hwp + FEC_TCSR(channel));
+	if (val & FEC_T_TF_MASK) {
+		/* Write the next-next compare value (not the next one,
+		 * according to the spec) to the register
+		 */
+		writel(fep->next_counter, fep->hwp + FEC_TCCR(channel));
+		do {
+			writel(val, fep->hwp + FEC_TCSR(channel));
+		} while (readl(fep->hwp + FEC_TCSR(channel)) & FEC_T_TF_MASK);
+
+		/* Update the counter; */
+		fep->next_counter = (fep->next_counter + fep->reload_period) & fep->cc.mask;
+
+		event.type = PTP_CLOCK_PPS;
+		ptp_clock_event(fep->ptp_clock, &event);
+		return 1;
+	}
+
+	return 0;
+}
-- 
1.9.1

^ permalink raw reply related
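
The compare-value arithmetic in fec_ptp_enable_pps() above can be checked with a small stand-alone sketch. It assumes the counter advances by 1 per nanosecond and wraps at 31 bits, as the series configures it. first_compare(), next_compare() and COUNTER_MASK are hypothetical names used only for illustration.

```c
#include <stdint.h>

#define NSEC_PER_SEC	1000000000u
#define COUNTER_MASK	0x7fffffffu	/* 31-bit counter */

/*
 * counter:      raw hardware counter value at capture time (< 2^31)
 * ns_in_second: nanoseconds already elapsed in the current second (< 1e9)
 *
 * Returns the counter value of the second boundary after the next one,
 * so the target cannot pass while it is being computed and written.
 * The sum stays below 2^32, so the 32-bit arithmetic does not overflow.
 */
static uint32_t first_compare(uint32_t counter, uint32_t ns_in_second)
{
	uint32_t val = NSEC_PER_SEC - ns_in_second + counter;

	val += NSEC_PER_SEC;	/* skip the boundary that may be too close */
	return val & COUNTER_MASK;
}

/* Each later compare value is one second on, wrapped to 31 bits. */
static uint32_t next_compare(uint32_t prev)
{
	return (prev + NSEC_PER_SEC) & COUNTER_MASK;
}
```

For example, with the counter at 500 and 20 ns left before the second boundary, the first target lands at 1000000520 rather than at 520, which would likely have passed before the register write completed.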

* [PATCH v3 1/3] net: fec: ptp: Use the 31-bit ptp timer.
From: Luwei Zhou @ 2014-10-10  5:15 UTC (permalink / raw)
  To: davem, richardcochran
  Cc: netdev, shawn.guo, bhutchings, R49496, b38611, b20596, stephen
In-Reply-To: <1412918130-18830-1-git-send-email-b45643@freescale.com>

When ptp switches from software adjustment to hardware adjustment, linux ptp
can't converge. This is caused by an IP limitation: the hardware adjustment
logic misbehaves when the ptp counter runs past 0x80000000 (31-bit counter).
The internal IP reference manual has already removed 32-bit free-running
count support. This patch replaces the 32-bit PTP timer with a 31-bit one.

Signed-off-by: Luwei Zhou <b45643@freescale.com>
Signed-off-by: Frank Li <Frank.Li@freescale.com>
---
 drivers/net/ethernet/freescale/fec_ptp.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/freescale/fec_ptp.c b/drivers/net/ethernet/freescale/fec_ptp.c
index cca3617..8016bdd 100644
--- a/drivers/net/ethernet/freescale/fec_ptp.c
+++ b/drivers/net/ethernet/freescale/fec_ptp.c
@@ -70,6 +70,7 @@
 #define FEC_TS_TIMESTAMP	0x418
 
 #define FEC_CC_MULT	(1 << 31)
+#define FEC_COUNTER_PERIOD	(1 << 31)
 /**
  * fec_ptp_read - read raw cycle counter (to be used by time counter)
  * @cc: the cyclecounter structure
@@ -113,14 +114,15 @@ void fec_ptp_start_cyclecounter(struct net_device *ndev)
 	/* 1ns counter */
 	writel(inc << FEC_T_INC_OFFSET, fep->hwp + FEC_ATIME_INC);
 
-	/* use free running count */
-	writel(0, fep->hwp + FEC_ATIME_EVT_PERIOD);
+	/* use 31-bit timer counter */
+	writel(FEC_COUNTER_PERIOD, fep->hwp + FEC_ATIME_EVT_PERIOD);
 
-	writel(FEC_T_CTRL_ENABLE, fep->hwp + FEC_ATIME_CTRL);
+	writel(FEC_T_CTRL_ENABLE | FEC_T_CTRL_PERIOD_RST,
+		fep->hwp + FEC_ATIME_CTRL);
 
 	memset(&fep->cc, 0, sizeof(fep->cc));
 	fep->cc.read = fec_ptp_read;
-	fep->cc.mask = CLOCKSOURCE_MASK(32);
+	fep->cc.mask = CLOCKSOURCE_MASK(31);
 	fep->cc.shift = 31;
 	fep->cc.mult = FEC_CC_MULT;
 
-- 
1.9.1

^ permalink raw reply related
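
Why the cyclecounter mask has to match the hardware wrap point in the patch above can be seen in a short sketch. COUNTER_MASK() is written to resemble the kernel's CLOCKSOURCE_MASK() for widths below 64 bits, and cycles_elapsed() is a hypothetical helper that models the masked subtraction a timecounter performs. Neither is copied from kernel source.

```c
#include <stdint.h>

/* All-ones mask of the given width; valid for bits < 64 in this sketch. */
#define COUNTER_MASK(bits) ((uint64_t)((1ULL << (bits)) - 1))

/*
 * Cycles elapsed between two raw counter reads, tolerating one wrap.
 * The result is only correct when 'bits' matches the width at which
 * the hardware counter actually wraps.
 */
static uint64_t cycles_elapsed(uint64_t now, uint64_t last, unsigned int bits)
{
	return (now - last) & COUNTER_MASK(bits);
}
```

If the hardware wraps at 2^31 and the counter moves from 0x7ffffffb to 5, the 31-bit mask reports the true 10 cycles, while a 32-bit mask reports a delta of more than two billion cycles.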
