Netdev List
* Re: [PATCH] net/udp: do not touch skb->peeked unless really needed
From: Hannes Frederic Sowa @ 2016-12-07 17:37 UTC (permalink / raw)
  To: Eric Dumazet, David Laight
  Cc: Paolo Abeni, David Miller, netdev, Willem de Bruijn
In-Reply-To: <1481131924.4930.40.camel@edumazet-glaptop3.roam.corp.google.com>

On Wed, Dec 7, 2016, at 18:32, Eric Dumazet wrote:
> On Wed, 2016-12-07 at 17:09 +0000, David Laight wrote:
> > From: Paolo Abeni
> > > Sent: 06 December 2016 17:08
> > ...
> > > @@ -79,6 +82,9 @@ struct udp_sock {
> > >  	int			(*gro_complete)(struct sock *sk,
> > >  						struct sk_buff *skb,
> > >  						int nhoff);
> > > +
> > > +	/* since we are prone to drops, avoid dirtying any sk cacheline */
> > > +	atomic_t		drops ____cacheline_aligned_in_smp;
> > >  };
> > 
> > Isn't that likely to create a large hole on systems with large cache lines.
> > (Same as any other use of ____cacheline_aligned_in_smp.)
> 
> Yes, I would like to avoid that, unless we come to the conclusion it is
> absolutely needed.
> 
> I feel that we could simply use a pointer, and allocate memory on
> demand, since many sockets do not ever experience a drop.
> 
> The pointer could stay in a read mostly section.
> 
> We even could use per cpu or node counter for some heavy drop cases. 

I had the same idea while discussing that with Paolo, merely using an
*atomic_t = kmalloc(sizeof(atomic_t)) out of band of the socket.

My fear was that those could be aggregated by the slab cache into one
cache line, causing even more heating on cachelines.

Bye,
Hannes

^ permalink raw reply

* Re: commit : ppp: add rtnetlink device creation support - breaks netcf on my machine.
From: Thomas Haller @ 2016-12-07 17:43 UTC (permalink / raw)
  To: Dan Williams, Guillaume Nault, Brad Campbell
  Cc: netdev, Thomas Graf, David Miller
In-Reply-To: <1481065966.11028.3.camel@redhat.com>

On Tue, 2016-12-06 at 17:12 -0600, Dan Williams wrote:
> 
> > libnl1 rejects the IFLA_INFO_DATA attribute because it expects it
> > to
> > contain a sub-attribute. Since the payload size is zero it doesn't
> > match the policy and parsing fails.
> > 
> > There's no problem with libnl3 because its policy accepts empty
> > payloads for NLA_NESTED attributes (see libnl3 commit 4be02ace4826

Hi,

libnl1 is unmaintained these days. I don't think it makes sense to
backport that patch. The last upstream release was 3+ years ago, with
no upstream development since then.

IMHO netcf should drop libnl-1 support.

best,
Thomas

^ permalink raw reply

* Re: [PATCH net-next] of: add optional naming of interfaces
From: Florian Fainelli @ 2016-12-07 17:49 UTC (permalink / raw)
  To: Volodymyr Bendiuga, robh+dt-DgEjT+Ai2ygdnm+yROfE0A,
	mark.rutland-5wv7dgnIgG8, frowand.list-Re5JQEeQqe8AvxtiuMwx3w,
	netdev-u79uwXL29TY76Z2rM5mHXA, devicetree-u79uwXL29TY76Z2rM5mHXA,
	volodymyr.bendiuga-Re5JQEeQqe8AvxtiuMwx3w
  Cc: Jonas Johansson, Mattias Walström
In-Reply-To: <1481116349-20678-1-git-send-email-volodymyr.bendiuga-qeDNsGSBLoYwFerOooGFRg@public.gmane.org>

On 12/07/2016 05:12 AM, Volodymyr Bendiuga wrote:
> From: Jonas Johansson <jonas.johansson-qeDNsGSBLoYwFerOooGFRg@public.gmane.org>
> 
> Signed-off-by: Mattias Walström <lazzer-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> Signed-off-by: Jonas Johansson <jonas.johansson-qeDNsGSBLoYwFerOooGFRg@public.gmane.org>

This does not belong in the Device Tree; there should be plenty of
information in user-space to make an educated device rename. I
definitely understand that some drivers (e.g. dsa) do actually get
their interface name from Device Tree directly (label property), but
this is probably the one and only case where this may be tolerated.

Besides, if you submit such a change, you would want to also provide a
consumer of that API to illustrate how this is used.
-- 
Florian

^ permalink raw reply

* Re: [PATCH] net/udp: do not touch skb->peeked unless really needed
From: Eric Dumazet @ 2016-12-07 17:52 UTC (permalink / raw)
  To: Hannes Frederic Sowa
  Cc: David Laight, Paolo Abeni, David Miller, netdev, Willem de Bruijn
In-Reply-To: <1481132267.1541189.811630457.167E9C56@webmail.messagingengine.com>

On Wed, 2016-12-07 at 18:37 +0100, Hannes Frederic Sowa wrote:

> I had the same idea while discussing that with Paolo, merely using an
> *atomic_t = kmalloc(sizeof(atomic_t)) out of band of the socket.
> 
> My fear was that those could be aggregated by the slab cache into one
> cache line, causing even more heating on cachelines.

For hot stuff, better use kmalloc(max_t(size_t,
                                        L1_CACHE_BYTES,
                                        sizeof(...)))
to avoid false sharing, unless this is per cpu data of course.
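
A userspace sketch of the same idea, for illustration only: the allocation
is rounded up to a whole cache line so that two out-of-band counters can
never land in the same line. DEMO_CACHE_LINE_BYTES, padded_size() and
alloc_drop_counter() are made-up names, 64 bytes is an assumed line size
(the kernel would use L1_CACHE_BYTES), and aligned_alloc() stands in for
kmalloc().

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed cache line size; kernel code would use L1_CACHE_BYTES. */
#define DEMO_CACHE_LINE_BYTES 64

/* Round a size up to a whole number of cache lines (hypothetical helper). */
static size_t padded_size(size_t size)
{
	return (size + DEMO_CACHE_LINE_BYTES - 1) /
	       DEMO_CACHE_LINE_BYTES * DEMO_CACHE_LINE_BYTES;
}

/*
 * Allocate one drop counter that starts on a cache line boundary and
 * owns the whole line, so a neighbouring allocation cannot share it.
 * Returns NULL on allocation failure.
 */
static atomic_int *alloc_drop_counter(void)
{
	atomic_int *drops = aligned_alloc(DEMO_CACHE_LINE_BYTES,
					  padded_size(sizeof(atomic_int)));

	if (drops)
		atomic_init(drops, 0);
	return drops;
}
```

The cost is the same one David pointed out for the in-struct variant: every
counter burns a full line, which is why allocating it only on the first
drop is attractive.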

^ permalink raw reply

* Re: [PATCH] net/udp: do not touch skb->peeked unless really needed
From: Eric Dumazet @ 2016-12-07 17:55 UTC (permalink / raw)
  To: Hannes Frederic Sowa
  Cc: David Laight, Paolo Abeni, David Miller, netdev, Willem de Bruijn
In-Reply-To: <1481132267.1541189.811630457.167E9C56@webmail.messagingengine.com>

On Wed, 2016-12-07 at 18:37 +0100, Hannes Frederic Sowa wrote:

> I had the same idea while discussing that with Paolo, merely using an
> *atomic_t = kmalloc(sizeof(atomic_t)) out of band of the socket.
> 
> My fear was that those could be aggregated by the slab cache into one
> cache line, causing even more heating on cachelines.

My exact idea was to allow up to 4095 (or PAGE_SIZE - 1) increments to be
done on the counter before switching to dynamically allocated memory.

(Some packets might be dropped by TCP sockets; that is not necessarily a
sign of an attack, just some spurious retransmits.)

^ permalink raw reply

* Re: [PATCH] net: return value of skb_linearize should be handled in Linux kernel
From: Cong Wang @ 2016-12-07 17:57 UTC (permalink / raw)
  To: Zhouyi Zhou
  Cc: faisal.latif, dledford, sean.hefty, Hal Rosenstock, Jeff Kirsher,
	QLogic-Storage-Upstream, jejb, Martin K. Petersen,
	Johannes Thumshirn, jon.maloy, ying.xue, David Miller, linux-rdma,
	LKML, intel-wired-lan, Linux Kernel Network Developers,
	linux-scsi, fcoe-devel, tipc-discussion
In-Reply-To: <CAABZP2w_YsXgzZ0tH=r3mnLUbVfpY8Xn3vk9XoS_Hq1r8aeNUQ@mail.gmail.com>

On Tue, Dec 6, 2016 at 10:27 PM, Zhouyi Zhou <zhouzhouyi@gmail.com> wrote:
> On Wed, Dec 7, 2016 at 1:02 PM, Cong Wang <xiyou.wangcong@gmail.com> wrote:
>> On Mon, Dec 5, 2016 at 11:10 PM, Zhouyi Zhou <zhouzhouyi@gmail.com> wrote:
>>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_fcoe.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_fcoe.c
>>> index 2a653ec..ab787cb 100644
>>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_fcoe.c
>>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_fcoe.c
>>> @@ -490,7 +490,11 @@ int ixgbe_fcoe_ddp(struct ixgbe_adapter *adapter,
>>>          */
>>>         if ((fh->fh_r_ctl == FC_RCTL_DD_SOL_DATA) &&
>>>             (fctl & FC_FC_END_SEQ)) {
>>> -               skb_linearize(skb);
>>> +               int err = 0;
>>> +
>>> +               err = skb_linearize(skb);
>>> +               if (err)
>>> +                       return err;
>>
>>
>> You can reuse 'rc' instead of adding 'err'.
> rc here is meaningful for the length of data being ddped. If using rc
> here, a successful
> skb_linearize will assign rc to 0.

Right, I thought it returns 0 on success.


>>
>>
>>
>>>                 crc = (struct fcoe_crc_eof *)skb_put(skb, sizeof(*crc));
>>>                 crc->fcoe_eof = FC_EOF_T;
>>>         }
>>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>>> index fee1f29..4926d48 100644
>>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>>> @@ -2173,8 +2173,7 @@ static int ixgbe_clean_rx_irq(struct ixgbe_q_vector *q_vector,
>>>                                 total_rx_bytes += ddp_bytes;
>>>                                 total_rx_packets += DIV_ROUND_UP(ddp_bytes,
>>>                                                                  mss);
>>> -                       }
>>> -                       if (!ddp_bytes) {
>>> +                       } else {
>>>                                 dev_kfree_skb_any(skb);
>>>                                 continue;
>>>                         }
>>
>>
>> This piece doesn't seem to be related.
> if ddp_bytes is negative there will be some error, I think the skb
> should not pass to upper layer.

You misunderstand my point: this return value is for ixgbe_fcoe_ddp(),
not skb_linearize(). You need to make it a separate patch because this
patch, as in $subject, only fixes skb_linearize().

^ permalink raw reply

* [PATCH] net: pch_gbe: Fix TX RX descriptor accesses for big endian systems
From: Hassan Naveed @ 2016-12-07 17:58 UTC (permalink / raw)
  To: netdev
  Cc: Hassan Naveed, Paul Burton, Matt Redfearn, David S. Miller,
	Florian Westphal, françois romieu

Fix pch_gbe driver for ethernet operations for a big endian CPU.
Values written to and read from transmit and receive descriptors
in the pch_gbe driver are byte swapped from the perspective of a
big endian CPU, since the ethernet controller always operates in
little endian mode. Rectify this by appropriately byte swapping
these descriptor field values in the driver software.

Signed-off-by: Hassan Naveed <hassan.naveed@imgtec.com>
Reviewed-by: Paul Burton <paul.burton@imgtec.com>
Reviewed-by: Matt Redfearn <matt.redfearn@imgtec.com>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: Matt Redfearn <matt.redfearn@imgtec.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Florian Westphal <fw@strlen.de>
Cc: françois romieu <romieu@fr.zoreil.com>
---
 .../net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c   | 66 ++++++++++++----------
 1 file changed, 35 insertions(+), 31 deletions(-)

diff --git a/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c b/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c
index d1048dd..6937169 100644
--- a/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c
+++ b/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c
@@ -1250,11 +1250,11 @@ static void pch_gbe_tx_queue(struct pch_gbe_adapter *adapter,
 
 	/*-- Set Tx descriptor --*/
 	tx_desc = PCH_GBE_TX_DESC(*tx_ring, ring_num);
-	tx_desc->buffer_addr = (buffer_info->dma);
-	tx_desc->length = skb->len;
-	tx_desc->tx_words_eob = skb->len + 3;
-	tx_desc->tx_frame_ctrl = (frame_ctrl);
-	tx_desc->gbec_status = (DSC_INIT16);
+	tx_desc->buffer_addr = cpu_to_le32(buffer_info->dma);
+	tx_desc->length = cpu_to_le16(skb->len);
+	tx_desc->tx_words_eob = cpu_to_le16(skb->len + 3);
+	tx_desc->tx_frame_ctrl = cpu_to_le16(frame_ctrl);
+	tx_desc->gbec_status = cpu_to_le16(DSC_INIT16);
 
 	if (unlikely(++ring_num == tx_ring->count))
 		ring_num = 0;
@@ -1460,8 +1460,8 @@ static irqreturn_t pch_gbe_intr(int irq, void *data)
 		}
 		buffer_info->mapped = true;
 		rx_desc = PCH_GBE_RX_DESC(*rx_ring, i);
-		rx_desc->buffer_addr = (buffer_info->dma);
-		rx_desc->gbec_status = DSC_INIT16;
+		rx_desc->buffer_addr = cpu_to_le32(buffer_info->dma);
+		rx_desc->gbec_status = cpu_to_le16(DSC_INIT16);
 
 		netdev_dbg(netdev,
 			   "i = %d  buffer_info->dma = 0x08%llx  buffer_info->length = 0x%x\n",
@@ -1533,7 +1533,7 @@ static void pch_gbe_alloc_tx_buffers(struct pch_gbe_adapter *adapter,
 		skb_reserve(skb, PCH_GBE_DMA_ALIGN);
 		buffer_info->skb = skb;
 		tx_desc = PCH_GBE_TX_DESC(*tx_ring, i);
-		tx_desc->gbec_status = (DSC_INIT16);
+		tx_desc->gbec_status = cpu_to_le16(DSC_INIT16);
 	}
 	return;
 }
@@ -1564,11 +1564,12 @@ static void pch_gbe_alloc_tx_buffers(struct pch_gbe_adapter *adapter,
 	i = tx_ring->next_to_clean;
 	tx_desc = PCH_GBE_TX_DESC(*tx_ring, i);
 	netdev_dbg(adapter->netdev, "gbec_status:0x%04x  dma_status:0x%04x\n",
-		   tx_desc->gbec_status, tx_desc->dma_status);
+		   le16_to_cpu(tx_desc->gbec_status), tx_desc->dma_status);
 
 	unused = PCH_GBE_DESC_UNUSED(tx_ring);
 	thresh = tx_ring->count - PCH_GBE_TX_WEIGHT;
-	if ((tx_desc->gbec_status == DSC_INIT16) && (unused < thresh))
+	if ((le16_to_cpu(tx_desc->gbec_status) == DSC_INIT16) &&
+		(unused < thresh))
 	{  /* current marked clean, tx queue filling up, do extra clean */
 		int j, k;
 		if (unused < 8) {  /* tx queue nearly full */
@@ -1583,47 +1584,49 @@ static void pch_gbe_alloc_tx_buffers(struct pch_gbe_adapter *adapter,
 		for (j = 0; j < PCH_GBE_TX_WEIGHT; j++)
 		{
 			tx_desc = PCH_GBE_TX_DESC(*tx_ring, k);
-			if (tx_desc->gbec_status != DSC_INIT16) break; /*found*/
+			if (le16_to_cpu(tx_desc->gbec_status) != DSC_INIT16)
+				break; /*found*/
 			if (++k >= tx_ring->count) k = 0;  /*increment, wrap*/
 		}
 		if (j < PCH_GBE_TX_WEIGHT) {
 			netdev_dbg(adapter->netdev,
 				   "clean_tx: unused=%d loops=%d found tx_desc[%x,%x:%x].gbec_status=%04x\n",
 				   unused, j, i, k, tx_ring->next_to_use,
-				   tx_desc->gbec_status);
+				   le16_to_cpu(tx_desc->gbec_status));
 			i = k;  /*found one to clean, usu gbec_status==2000.*/
 		}
 	}
 
-	while ((tx_desc->gbec_status & DSC_INIT16) == 0x0000) {
+	while ((le16_to_cpu(tx_desc->gbec_status) & DSC_INIT16) == 0x0000) {
 		netdev_dbg(adapter->netdev, "gbec_status:0x%04x\n",
-			   tx_desc->gbec_status);
+			   le16_to_cpu(tx_desc->gbec_status));
 		buffer_info = &tx_ring->buffer_info[i];
 		skb = buffer_info->skb;
 		cleaned = true;
 
-		if ((tx_desc->gbec_status & PCH_GBE_TXD_GMAC_STAT_ABT)) {
+		if ((le16_to_cpu(tx_desc->gbec_status) &
+			PCH_GBE_TXD_GMAC_STAT_ABT)) {
 			adapter->stats.tx_aborted_errors++;
 			netdev_err(adapter->netdev, "Transfer Abort Error\n");
-		} else if ((tx_desc->gbec_status & PCH_GBE_TXD_GMAC_STAT_CRSER)
-			  ) {
+		} else if ((le16_to_cpu(tx_desc->gbec_status) &
+				PCH_GBE_TXD_GMAC_STAT_CRSER)) {
 			adapter->stats.tx_carrier_errors++;
 			netdev_err(adapter->netdev,
 				   "Transfer Carrier Sense Error\n");
-		} else if ((tx_desc->gbec_status & PCH_GBE_TXD_GMAC_STAT_EXCOL)
-			  ) {
+		} else if ((le16_to_cpu(tx_desc->gbec_status) &
+					PCH_GBE_TXD_GMAC_STAT_EXCOL)) {
 			adapter->stats.tx_aborted_errors++;
 			netdev_err(adapter->netdev,
 				   "Transfer Collision Abort Error\n");
-		} else if ((tx_desc->gbec_status &
+		} else if ((le16_to_cpu(tx_desc->gbec_status) &
 			    (PCH_GBE_TXD_GMAC_STAT_SNGCOL |
 			     PCH_GBE_TXD_GMAC_STAT_MLTCOL))) {
 			adapter->stats.collisions++;
 			adapter->stats.tx_packets++;
 			adapter->stats.tx_bytes += skb->len;
 			netdev_dbg(adapter->netdev, "Transfer Collision\n");
-		} else if ((tx_desc->gbec_status & PCH_GBE_TXD_GMAC_STAT_CMPLT)
-			  ) {
+		} else if ((le16_to_cpu(tx_desc->gbec_status) &
+				PCH_GBE_TXD_GMAC_STAT_CMPLT)) {
 			adapter->stats.tx_packets++;
 			adapter->stats.tx_bytes += skb->len;
 		}
@@ -1639,7 +1642,7 @@ static void pch_gbe_alloc_tx_buffers(struct pch_gbe_adapter *adapter,
 				   "trim buffer_info->skb : %d\n", i);
 			skb_trim(buffer_info->skb, 0);
 		}
-		tx_desc->gbec_status = DSC_INIT16;
+		tx_desc->gbec_status = cpu_to_le16(DSC_INIT16);
 		if (unlikely(++i == tx_ring->count))
 			i = 0;
 		tx_desc = PCH_GBE_TX_DESC(*tx_ring, i);
@@ -1705,15 +1708,15 @@ static void pch_gbe_alloc_tx_buffers(struct pch_gbe_adapter *adapter,
 	while (*work_done < work_to_do) {
 		/* Check Rx descriptor status */
 		rx_desc = PCH_GBE_RX_DESC(*rx_ring, i);
-		if (rx_desc->gbec_status == DSC_INIT16)
+		if (le16_to_cpu(rx_desc->gbec_status) == DSC_INIT16)
 			break;
 		cleaned = true;
 		cleaned_count++;
 
 		dma_status = rx_desc->dma_status;
-		gbec_status = rx_desc->gbec_status;
-		tcp_ip_status = rx_desc->tcp_ip_status;
-		rx_desc->gbec_status = DSC_INIT16;
+		gbec_status = le16_to_cpu(rx_desc->gbec_status);
+		tcp_ip_status = le32_to_cpu(rx_desc->tcp_ip_status);
+		rx_desc->gbec_status = cpu_to_le16(DSC_INIT16);
 		buffer_info = &rx_ring->buffer_info[i];
 		skb = buffer_info->skb;
 		buffer_info->skb = NULL;
@@ -1742,8 +1745,9 @@ static void pch_gbe_alloc_tx_buffers(struct pch_gbe_adapter *adapter,
 		} else {
 			/* get receive length */
 			/* length convert[-3], length includes FCS length */
-			length = (rx_desc->rx_words_eob) - 3 - ETH_FCS_LEN;
-			if (rx_desc->rx_words_eob & 0x02)
+			length = le16_to_cpu(rx_desc->rx_words_eob) - 3 -
+					ETH_FCS_LEN;
+			if (le16_to_cpu(rx_desc->rx_words_eob) & 0x02)
 				length = length - 4;
 			/*
 			 * buffer_info->rx_buffer: [Header:14][payload]
@@ -1823,7 +1827,7 @@ int pch_gbe_setup_tx_resources(struct pch_gbe_adapter *adapter,
 
 	for (desNo = 0; desNo < tx_ring->count; desNo++) {
 		tx_desc = PCH_GBE_TX_DESC(*tx_ring, desNo);
-		tx_desc->gbec_status = DSC_INIT16;
+		tx_desc->gbec_status = cpu_to_le16(DSC_INIT16);
 	}
 	netdev_dbg(adapter->netdev,
 		   "tx_ring->desc = 0x%p  tx_ring->dma = 0x%08llx next_to_clean = 0x%08x  next_to_use = 0x%08x\n",
@@ -1864,7 +1868,7 @@ int pch_gbe_setup_rx_resources(struct pch_gbe_adapter *adapter,
 	rx_ring->next_to_use = 0;
 	for (desNo = 0; desNo < rx_ring->count; desNo++) {
 		rx_desc = PCH_GBE_RX_DESC(*rx_ring, desNo);
-		rx_desc->gbec_status = DSC_INIT16;
+		rx_desc->gbec_status = cpu_to_le16(DSC_INIT16);
 	}
 	netdev_dbg(adapter->netdev,
 		   "rx_ring->desc = 0x%p  rx_ring->dma = 0x%08llx next_to_clean = 0x%08x  next_to_use = 0x%08x\n",
-- 
1.9.1
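
For readers outside the kernel, a minimal userspace sketch of what the
conversions above achieve. In the kernel, cpu_to_le16() and le16_to_cpu()
are no-ops on little endian CPUs and byte swaps on big endian ones; the
helpers below get the same result on any host by writing the bytes
explicitly. store_le16(), load_le16() and DEMO_INIT16 are made-up names,
and 0xC000 is only an illustrative value, not taken from the driver.

```c
#include <stdint.h>

/* Illustrative status value; not claimed to match the driver's DSC_INIT16. */
#define DEMO_INIT16 0xC000u

/* Store a CPU-order value into a descriptor field in little endian order. */
static void store_le16(uint8_t *field, uint16_t value)
{
	field[0] = (uint8_t)(value & 0xff);	/* low byte first */
	field[1] = (uint8_t)(value >> 8);	/* high byte second */
}

/* Load a little endian descriptor field back into CPU order. */
static uint16_t load_le16(const uint8_t *field)
{
	return (uint16_t)(field[0] | (field[1] << 8));
}
```

A comparison such as "status == DEMO_INIT16" is only meaningful after
load_le16(), which mirrors why the patch wraps every read of gbec_status.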

^ permalink raw reply related

* [PATCH net-next] net: do not read sk_drops if application does not care
From: Eric Dumazet @ 2016-12-07 18:05 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, Paolo Abeni

From: Eric Dumazet <edumazet@google.com>

sk_drops can be a frequently written field; do not read it unless the
application has shown interest.

Note that sk_drops can be read via inet_diag, so applications
can avoid getting this info from every received packet.

In the future, 'reading' sk_drops might require folding per node or per
cpu fields, and thus become even more expensive than today.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Paolo Abeni <pabeni@redhat.com>
---
 include/net/sock.h |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/include/net/sock.h b/include/net/sock.h
index 1749e38d03014558ac882b5d1fb37b11ac5e6705..be167c1483f4a5a74b466f135bbfdf4281e5bef4 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -2160,7 +2160,8 @@ struct sock_skb_cb {
 static inline void
 sock_skb_set_dropcount(const struct sock *sk, struct sk_buff *skb)
 {
-	SOCK_SKB_CB(skb)->dropcount = atomic_read(&sk->sk_drops);
+	SOCK_SKB_CB(skb)->dropcount = sock_flag(sk, SOCK_RXQ_OVFL) ?
+						atomic_read(&sk->sk_drops) : 0;
 }
 
 static inline void sk_drops_add(struct sock *sk, const struct sk_buff *skb)
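
The pattern in the hunk above can be sketched outside the kernel as
follows. This is an illustration only: struct demo_sock and
demo_dropcount() are made-up names, the bool stands in for
sock_flag(sk, SOCK_RXQ_OVFL), and the C11 atomic stands in for sk_drops.

```c
#include <stdatomic.h>
#include <stdbool.h>

struct demo_sock {
	bool rxq_ovfl;		/* stand-in for sock_flag(sk, SOCK_RXQ_OVFL) */
	atomic_uint drops;	/* stand-in for sk->sk_drops */
};

/*
 * Only touch the (possibly contended) counter when the application asked
 * for drop counts; otherwise report 0 without loading it at all.
 */
static unsigned int demo_dropcount(struct demo_sock *sk)
{
	return sk->rxq_ovfl ? atomic_load(&sk->drops) : 0;
}
```

The flag sits in read-mostly state, so the common case avoids pulling in a
cache line that other CPUs keep dirtying.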

^ permalink raw reply related

* stmmac driver...
From: David Miller @ 2016-12-07 18:06 UTC (permalink / raw)
  To: peppe.cavallaro; +Cc: alexandre.torgue, netdev


Giuseppe and Alexandre,

There are a lot of patches and discussions happening around the stmmac
driver lately and both of you are listed as the maintainers.

I really need prompt and conclusive reviews of these patch submissions
from you, and participation in all discussions about the driver.

Otherwise I have only three things I can do: 1) let the patches rot in
patchwork for days 2) trust that the patches are sane and fit your
desires and goals and just apply them or 3) reject them since they
aren't being reviewed properly.

Thanks in advance.

^ permalink raw reply

* Re: [PATCH] net: stmmac: do not call phy_ethtool_ksettings_set from atomic context
From: David Miller @ 2016-12-07 18:09 UTC (permalink / raw)
  To: niklas.cassel
  Cc: peppe.cavallaro, alexandre.torgue, niklass, netdev, linux-kernel
In-Reply-To: <1481032035-19018-1-git-send-email-niklass@axis.com>

From: Niklas Cassel <niklas.cassel@axis.com>
Date: Tue, 6 Dec 2016 14:47:15 +0100

> From: Niklas Cassel <niklas.cassel@axis.com>
> 
> From what I can tell, spin_lock(&priv->lock) is not needed, since the
> phy_ethtool_ksettings_set call is not given the priv struct.
> 
> phy_start_aneg takes the phydev->lock. Calls to phy_adjust_link
> from phy_state_machine also takes the phydev->lock.
...
> Signed-off-by: Niklas Cassel <niklas.cassel@axis.com>

Applied, but please always be explicit about which tree you are targeting
with this patch by properly annotating for it in your Subject line.

In this case that would be "Subject: [PATCH net-next] ..."

^ permalink raw reply

* Re: [PATCH v2] drivers: net: cpsw-phy-sel: Clear RGMII_IDMODE on "rgmii" links
From: David Miller @ 2016-12-07 18:12 UTC (permalink / raw)
  To: alex.g
  Cc: mugunthanvnm, grygorii.strashko, linux-omap, netdev, linux-kernel,
	gokhan
In-Reply-To: <1481050611-29698-1-git-send-email-alex.g@adaptrum.com>

From: Alexandru Gagniuc <alex.g@adaptrum.com>
Date: Tue,  6 Dec 2016 10:56:51 -0800

> Support for setting the RGMII_IDMODE bit was added in the commit
> referenced below. However, that commit did not add the symmetrical
> clearing of the bit by way of setting it in "mask". Add it here.
> 
> Note that the documentation marks clearing this bit as "reserved",
> however, according to TI, support for delaying the clock does exist in
> the MAC, although it is not officially supported.
> We tested this on a board with an RGMII to RGMII link that will not
> work unless this bit is cleared.
> 
> Fixes: 0fb26c3063ea ("drivers: net: cpsw-phy-sel: add support to configure rgmii internal delay")
> Signed-off-by: Alexandru Gagniuc <alex.g@adaptrum.com>

Applied.

^ permalink raw reply

* Re: [PATCH net] netvsc: reduce maximum GSO size
From: David Miller @ 2016-12-07 18:14 UTC (permalink / raw)
  To: stephen; +Cc: netdev, sthemmin
In-Reply-To: <20161206214354.15473-1-sthemmin@microsoft.com>

From: Stephen Hemminger <stephen@networkplumber.org>
Date: Tue,  6 Dec 2016 13:43:54 -0800

> Hyper-V (and Azure) support using NVGRE which requires some extra space
> for encapsulation headers. Because of this the largest allowed TSO
> packet is reduced.
> 
> For older releases, hard code a fixed reduced value.  For next release,
> there is a better solution which uses result of host offload
> negotiation.
> 
> Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
> ---
> Please queue this for stable as well.

Applied and queued up for -stable, thanks.

^ permalink raw reply

* Re: [PATCH] sh_eth: add wake-on-lan support via magic packet
From: Geert Uytterhoeven @ 2016-12-07 18:14 UTC (permalink / raw)
  To: Niklas Söderlund
  Cc: Sergei Shtylyov, Simon Horman, netdev@vger.kernel.org,
	Linux-Renesas
In-Reply-To: <20161207162843.4731-1-niklas.soderlund+renesas@ragnatech.se>

Hi Niklas,

On Wed, Dec 7, 2016 at 5:28 PM, Niklas Söderlund
<niklas.soderlund+renesas@ragnatech.se> wrote:
> Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>

Thanks, works fine on r8a7791/koelsch!

Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>

> --- a/drivers/net/ethernet/renesas/sh_eth.c
> +++ b/drivers/net/ethernet/renesas/sh_eth.c
> @@ -624,7 +624,7 @@ static struct sh_eth_cpu_data r8a779x_data = {
>
>         .register_type  = SH_ETH_REG_FAST_RCAR,
>
> -       .ecsr_value     = ECSR_PSRTO | ECSR_LCHNG | ECSR_ICD,
> +       .ecsr_value     = ECSR_PSRTO | ECSR_LCHNG | ECSR_ICD | ECSR_MPD,

Interestingly, the ECSR_MPD bit is already set for several SoCs.

Hence adding ".magic = 1" to the entry for r8a7740 instantly gave me working
WoL support on r8a7740/armadillo. Cool!

> --- a/drivers/net/ethernet/renesas/sh_eth.h
> +++ b/drivers/net/ethernet/renesas/sh_eth.h
> @@ -493,6 +493,7 @@ struct sh_eth_cpu_data {
>         unsigned shift_rd0:1;   /* shift Rx descriptor word 0 right by 16 */
>         unsigned rmiimode:1;    /* EtherC has RMIIMODE register */
>         unsigned rtrate:1;      /* EtherC has RTRATE register */
> +       unsigned magic:1;       /* EtherC have PMDE in ECMR and MPDIP in ECSIPR */

Instead of adding a new flag, perhaps you can just check for the ECSR_MPD flag
in ecsr_value?

> @@ -529,6 +530,9 @@ struct sh_eth_private {
>         unsigned no_ether_link:1;
>         unsigned ether_link_active_low:1;
>         unsigned is_opened:1;
> +
> +       bool wol_enabled;

"unsigned wol_enabled:1", to merge with the bitfield above?

> +       struct clk *clk;

It's a good practice to keep all pointers at the top of the struct, to avoid
gaps due to alignment restrictions, especially on 64-bit (I know that's not
the case here).
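
A small sketch of the general effect, using made-up structs rather than
sh_eth_private: a pointer placed between two small members usually forces
padding on both sides of it, while putting the pointer first lets the
small members share the tail.

```c
#include <stddef.h>

/* Pointer sandwiched between small members: padding before and after. */
struct demo_gappy {
	char tag;
	void *ptr;
	char flag;
};

/* Pointer first: the two small members pack together behind it. */
struct demo_compact {
	void *ptr;
	char tag;
	char flag;
};

/* Bytes saved by the reordering on the ABI this is compiled for. */
static size_t demo_bytes_saved(void)
{
	return sizeof(struct demo_gappy) - sizeof(struct demo_compact);
}
```

On a typical LP64 ABI the first layout takes 24 bytes and the second 16;
the exact numbers depend on the platform's alignment rules.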

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply

* Re: [PATCH net-next] bpf: fix loading of BPF_MAXINSNS sized programs
From: David Miller @ 2016-12-07 18:18 UTC (permalink / raw)
  To: daniel; +Cc: alexei.starovoitov, netdev
In-Reply-To: <7aad2a49d7a717fd018b338bb9d7167bafd1fdb8.1481069166.git.daniel@iogearbox.net>

From: Daniel Borkmann <daniel@iogearbox.net>
Date: Wed,  7 Dec 2016 01:15:44 +0100

> General assumption is that single program can hold up to BPF_MAXINSNS,
> that is, 4096 number of instructions. It is the case with cBPF and
> that limit was carried over to eBPF. When recently testing digest, I
> noticed that it's actually not possible to feed 4096 instructions
> via bpf(2).
> 
> The check for > BPF_MAXINSNS was added back then to bpf_check() in
> cbd357008604 ("bpf: verifier (add ability to receive verification log)").
> However, 09756af46893 ("bpf: expand BPF syscall with program load/unload")
> added yet another check that comes before that into bpf_prog_load(),
> but this time bails out already in case of >= BPF_MAXINSNS.
> 
> Fix it up and perform the check early in bpf_prog_load(), so we can drop
> the second one in bpf_check(). It makes sense, because also a 0 insn
> program is useless and we don't want to waste any resources doing work
> up to bpf_check() point. The existing bpf(2) man page documents E2BIG
> as the official error for such cases, so just stick with it as well.
> 
> Fixes: 09756af46893 ("bpf: expand BPF syscall with program load/unload")
> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
> Acked-by: Alexei Starovoitov <ast@kernel.org>
> ---
>  ( net-next is just fine imho. )

Applied.
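
For reference, a standalone sketch of the off-by-one described in the
quoted changelog. It is not the kernel code: DEMO_MAXINSNS,
check_len_old() and check_len_fixed() are made-up names, with 4096 taken
from the changelog's description of BPF_MAXINSNS.

```c
#include <errno.h>

#define DEMO_MAXINSNS 4096	/* BPF_MAXINSNS per the changelog */

/* Old behaviour: ">=" wrongly rejects a program of exactly 4096 insns. */
static int check_len_old(unsigned int insn_cnt)
{
	return insn_cnt >= DEMO_MAXINSNS ? -E2BIG : 0;
}

/*
 * Behaviour described by the fix: accept 1..4096 instructions, reject
 * empty programs and anything larger with -E2BIG.
 */
static int check_len_fixed(unsigned int insn_cnt)
{
	if (insn_cnt == 0 || insn_cnt > DEMO_MAXINSNS)
		return -E2BIG;
	return 0;
}
```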

^ permalink raw reply

* [RFC PATCH net-next v3 1/2] macb: Add 1588 support in Cadence GEM.
From: Andrei Pistirica @ 2016-12-07 18:21 UTC (permalink / raw)
  To: netdev, linux-kernel, linux-arm-kernel, davem, nicolas.ferre,
	harinikatakamlinux, harini.katakam
  Cc: punnaia, michals, anirudh, boris.brezillon, alexandre.belloni,
	tbultel, richardcochran, rafalo, Andrei Pistirica

Cadence GEM provides a 102 bit time counter with 48 bits for seconds,
30 bits for nsecs and 24 bits for sub-nsecs to control 1588 timestamping.

This patch does the following:
- Registers to ptp clock framework
- Timer initialization is done by writing time of day to the timer counter.
- ns increment register is programmed as NSEC_PER_SEC/tsu-clock-rate.
  For a 16 bit subns precision, the subns increment equals
  remainder of (NSEC_PER_SEC/tsu-clock-rate) * (2^16).
- Timestamps are obtained from the TX/RX PTP event/PEER registers.
  The timestamp obtained thus is updated in skb for upper layers to access.
- The driver registers functions with ptp to perform time and frequency
  adjustment.
- Time adjustment is done by writing to the 1588_ADJUST register.
  The controller will read the delta in this register and update the timer
  counter register. Alternatively, for large time offset adjustments,
  the driver reads the secs and nsecs counter values, adds/subtracts the
  delta and updates the timer counter.
- Frequency is adjusted by adjusting addend (8bit nanosecond increment) and
  addendsub (16bit increment nanosecond fractions).
  The 102bit counter is incremented at nominal frequency with addend and
  addendsub values. Each period addend and addendsub values are adjusted
  based on ppm drift.
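
The increment computation in the list above can be sketched as plain C.
This is one reading of the description, not the driver code: the
fractional nanosecond left over from the integer division is scaled by
2^16 to give the sub-ns increment. struct demo_tsu_incr and
demo_tsu_calc() are made-up names.

```c
#include <stdint.h>

#define DEMO_NSEC_PER_SEC 1000000000ULL
#define DEMO_SUBNS_BITS   16	/* 16 bit sub-ns precision, as in the text */

struct demo_tsu_incr {
	uint32_t ns;		/* whole nanoseconds added per TSU clock tick */
	uint32_t sub_ns;	/* fraction of a ns per tick, in units of 2^-16 ns */
};

/* Split the nominal tick period into integer and fractional nanoseconds. */
static struct demo_tsu_incr demo_tsu_calc(uint32_t tsu_rate_hz)
{
	struct demo_tsu_incr incr;
	uint64_t rem = DEMO_NSEC_PER_SEC % tsu_rate_hz;

	incr.ns = (uint32_t)(DEMO_NSEC_PER_SEC / tsu_rate_hz);
	/* rem / rate is the leftover fraction of a ns; scale it by 2^16. */
	incr.sub_ns = (uint32_t)((rem << DEMO_SUBNS_BITS) / tsu_rate_hz);
	return incr;
}
```

With a 100 MHz TSU clock this gives ns = 10 and sub_ns = 0; with 150 MHz
it gives ns = 6 and sub_ns = 43690, i.e. roughly 6.6667 ns per tick.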

Signed-off-by: Andrei Pistirica <andrei.pistirica@microchip.com>
Signed-off-by: Harini Katakam <harinik@xilinx.com>
---
Patch history:

Version 1:
This patch is based on Harini's original patch, implemented in a
separate file to ease the review/maintenance and integration with
other platforms (e.g. Zynq Ultrascale+ MPSoC).
Feature was tested on SAMA5D2 platform using ptp4l v1.6 from linuxptp
project and also with ptpd2 version 2.3.1. PTP was tested over
IPv4,IPv6 and 802.3 protocols.

In case that macb is compiled as a module, it has been renamed to
cadence-macb.ko to avoid naming confusion in Makefile.

Version 2 modifications:
- bitfields for TSU are named according to SAMA5D2 data sheet
- identify GEM-PTP support based on platform capability
- add spinlock for TSU access
- change macb_ptp_adjfreq and use fewer 64bit divisions

Version 3 modifications:
- new adjfine api with one 64bit division for frequency adjustment
  (based on Richard's input)
- add maximum adjustment frequency (ppb) based on nominal frequency
- per platform PTP configuration
- cosmetic changes
Note 1: Kbuild uses "select" instead of "imply", and the macb maintainer agreed
        to make the change when it will be available in net-next.
Note 2: Guys, this driver does not support GEM-GXL!
Note 3: Patch on net-next, on December 7th.

 drivers/net/ethernet/cadence/Kconfig    |  10 +-
 drivers/net/ethernet/cadence/Makefile   |   8 +-
 drivers/net/ethernet/cadence/macb.h     | 102 +++++++++
 drivers/net/ethernet/cadence/macb_ptp.c | 390 ++++++++++++++++++++++++++++++++
 4 files changed, 508 insertions(+), 2 deletions(-)
 create mode 100644 drivers/net/ethernet/cadence/macb_ptp.c

diff --git a/drivers/net/ethernet/cadence/Kconfig b/drivers/net/ethernet/cadence/Kconfig
index f0bcb15..ebbc65f 100644
--- a/drivers/net/ethernet/cadence/Kconfig
+++ b/drivers/net/ethernet/cadence/Kconfig
@@ -29,6 +29,14 @@ config MACB
 	  support for the MACB/GEM chip.
 
 	  To compile this driver as a module, choose M here: the module
-	  will be called macb.
+	  will be called cadence-macb.
+
+config MACB_USE_HWSTAMP
+	bool "Use IEEE 1588 hwstamp"
+	depends on MACB
+	default y
+	select PTP_1588_CLOCK
+	---help---
+	  Enable IEEE 1588 Precision Time Protocol (PTP) support for MACB.
 
 endif # NET_CADENCE
diff --git a/drivers/net/ethernet/cadence/Makefile b/drivers/net/ethernet/cadence/Makefile
index 91f79b1..4402d42 100644
--- a/drivers/net/ethernet/cadence/Makefile
+++ b/drivers/net/ethernet/cadence/Makefile
@@ -2,4 +2,10 @@
 # Makefile for the Atmel network device drivers.
 #
 
-obj-$(CONFIG_MACB) += macb.o
+cadence-macb-y	:= macb.o
+
+ifeq ($(CONFIG_MACB_USE_HWSTAMP),y)
+cadence-macb-y	+= macb_ptp.o
+endif
+
+obj-$(CONFIG_MACB) += cadence-macb.o
diff --git a/drivers/net/ethernet/cadence/macb.h b/drivers/net/ethernet/cadence/macb.h
index d67adad..b3688f1 100644
--- a/drivers/net/ethernet/cadence/macb.h
+++ b/drivers/net/ethernet/cadence/macb.h
@@ -10,6 +10,9 @@
 #ifndef _MACB_H
 #define _MACB_H
 
+#include <linux/ptp_clock.h>
+#include <linux/ptp_clock_kernel.h>
+
 #define MACB_GREGS_NBR 16
 #define MACB_GREGS_VERSION 2
 #define MACB_MAX_QUEUES 8
@@ -131,6 +134,20 @@
 #define GEM_RXIPCCNT		0x01a8 /* IP header Checksum Error Counter */
 #define GEM_RXTCPCCNT		0x01ac /* TCP Checksum Error Counter */
 #define GEM_RXUDPCCNT		0x01b0 /* UDP Checksum Error Counter */
+#define GEM_TISUBN		0x01bc /* 1588 Timer Increment Sub-ns */
+#define GEM_TSH			0x01c0 /* 1588 Timer Seconds High */
+#define GEM_TSL			0x01d0 /* 1588 Timer Seconds Low */
+#define GEM_TN			0x01d4 /* 1588 Timer Nanoseconds */
+#define GEM_TA			0x01d8 /* 1588 Timer Adjust */
+#define GEM_TI			0x01dc /* 1588 Timer Increment */
+#define GEM_EFTSL		0x01e0 /* PTP Event Frame Tx Seconds Low */
+#define GEM_EFTN		0x01e4 /* PTP Event Frame Tx Nanoseconds */
+#define GEM_EFRSL		0x01e8 /* PTP Event Frame Rx Seconds Low */
+#define GEM_EFRN		0x01ec /* PTP Event Frame Rx Nanoseconds */
+#define GEM_PEFTSL		0x01f0 /* PTP Peer Event Frame Tx Secs Low */
+#define GEM_PEFTN		0x01f4 /* PTP Peer Event Frame Tx Ns */
+#define GEM_PEFRSL		0x01f8 /* PTP Peer Event Frame Rx Sec Low */
+#define GEM_PEFRN		0x01fc /* PTP Peer Event Frame Rx Ns */
 #define GEM_DCFG1		0x0280 /* Design Config 1 */
 #define GEM_DCFG2		0x0284 /* Design Config 2 */
 #define GEM_DCFG3		0x0288 /* Design Config 3 */
@@ -174,6 +191,7 @@
 #define MACB_NCR_TPF_SIZE	1
 #define MACB_TZQ_OFFSET		12 /* Transmit zero quantum pause frame */
 #define MACB_TZQ_SIZE		1
+#define MACB_SRTSM_OFFSET	15
 
 /* Bitfields in NCFGR */
 #define MACB_SPD_OFFSET		0 /* Speed */
@@ -319,6 +337,32 @@
 #define MACB_PTZ_SIZE		1
 #define MACB_WOL_OFFSET		14 /* Enable wake-on-lan interrupt */
 #define MACB_WOL_SIZE		1
+#define MACB_DRQFR_OFFSET	18 /* PTP Delay Request Frame Received */
+#define MACB_DRQFR_SIZE		1
+#define MACB_SFR_OFFSET		19 /* PTP Sync Frame Received */
+#define MACB_SFR_SIZE		1
+#define MACB_DRQFT_OFFSET	20 /* PTP Delay Request Frame Transmitted */
+#define MACB_DRQFT_SIZE		1
+#define MACB_SFT_OFFSET		21 /* PTP Sync Frame Transmitted */
+#define MACB_SFT_SIZE		1
+#define MACB_PDRQFR_OFFSET	22 /* PDelay Request Frame Received */
+#define MACB_PDRQFR_SIZE	1
+#define MACB_PDRSFR_OFFSET	23 /* PDelay Response Frame Received */
+#define MACB_PDRSFR_SIZE	1
+#define MACB_PDRQFT_OFFSET	24 /* PDelay Request Frame Transmitted */
+#define MACB_PDRQFT_SIZE	1
+#define MACB_PDRSFT_OFFSET	25 /* PDelay Response Frame Transmitted */
+#define MACB_PDRSFT_SIZE	1
+#define MACB_SRI_OFFSET		26 /* TSU Seconds Register Increment */
+#define MACB_SRI_SIZE		1
+
+/* Timer increment fields */
+#define MACB_TI_CNS_OFFSET	0
+#define MACB_TI_CNS_SIZE	8
+#define MACB_TI_ACNS_OFFSET	8
+#define MACB_TI_ACNS_SIZE	8
+#define MACB_TI_NIT_OFFSET	16
+#define MACB_TI_NIT_SIZE	8
 
 /* Bitfields in MAN */
 #define MACB_DATA_OFFSET	0 /* data */
@@ -386,6 +430,17 @@
 #define GEM_PBUF_LSO_OFFSET			27
 #define GEM_PBUF_LSO_SIZE			1
 
+/* Bitfields in TISUBN */
+#define GEM_SUBNSINCR_OFFSET			0
+#define GEM_SUBNSINCR_SIZE			16
+
+/* Bitfields in TI */
+#define GEM_NSINCR_OFFSET			0
+#define GEM_NSINCR_SIZE				8
+
+/* Bitfields in TA */
+#define GEM_ADDSUB_OFFSET			31
+#define GEM_ADDSUB_SIZE				1
 /* Constants for CLK */
 #define MACB_CLK_DIV8				0
 #define MACB_CLK_DIV16				1
@@ -417,6 +472,7 @@
 #define MACB_CAPS_GIGABIT_MODE_AVAILABLE	0x20000000
 #define MACB_CAPS_SG_DISABLED			0x40000000
 #define MACB_CAPS_MACB_IS_GEM			0x80000000
+#define MACB_CAPS_GEM_HAS_PTP			0x00000020
 
 /* LSO settings */
 #define MACB_LSO_UFO_ENABLE			0x01
@@ -782,6 +838,19 @@ struct macb_or_gem_ops {
 	int	(*mog_rx)(struct macb *bp, int budget);
 };
 
+/* MACB-PTP interface: adapt to platform needs and GEM (e.g. GXL). */
+struct macb_ptp_info {
+	void (*ptp_init)(struct net_device *ndev);
+	void (*ptp_remove)(struct net_device *ndev);
+	unsigned int (*get_tsu_rate)(struct macb *bp);
+	int (*get_ts_info)(struct net_device *dev,
+			   struct ethtool_ts_info *info);
+	int (*hwtst_get)(struct net_device *netdev,
+			 struct ifreq *ifr);
+	int (*hwtst_set)(struct net_device *netdev,
+			 struct ifreq *ifr, int cmd);
+};
+
 struct macb_config {
 	u32			caps;
 	unsigned int		dma_burst_length;
@@ -874,11 +943,44 @@ struct macb {
 	unsigned int		jumbo_max_len;
 
 	u32			wol;
+
+	struct macb_ptp_info	*ptp_info;
+#ifdef CONFIG_MACB_USE_HWSTAMP
+	bool			hwts_tx_en;
+	bool			hwts_rx_en;
+	spinlock_t		tsu_clk_lock; /* gem tsu clock locking */
+	unsigned int		tsu_rate;
+
+	struct ptp_clock	*ptp_clock;
+	struct ptp_clock_info	ptp_caps;
+	u32			ns_incr;
+	u32			subns_incr;
+#endif
 };
 
+#ifdef CONFIG_MACB_USE_HWSTAMP
+void gem_ptp_init(struct net_device *ndev);
+void gem_ptp_remove(struct net_device *ndev);
+
+void gem_ptp_do_txstamp(struct macb *bp, struct sk_buff *skb);
+void gem_ptp_do_rxstamp(struct macb *bp, struct sk_buff *skb);
+
+#else
+static inline void gem_ptp_init(struct net_device *ndev) { }
+static inline void gem_ptp_remove(struct net_device *ndev) { }
+
+static inline void gem_ptp_do_txstamp(struct macb *bp, struct sk_buff *skb) { }
+static inline void gem_ptp_do_rxstamp(struct macb *bp, struct sk_buff *skb) { }
+#endif
+
 static inline bool macb_is_gem(struct macb *bp)
 {
 	return !!(bp->caps & MACB_CAPS_MACB_IS_GEM);
 }
 
+static inline bool gem_has_ptp(struct macb *bp)
+{
+	return !!(bp->caps & MACB_CAPS_GEM_HAS_PTP);
+}
+
 #endif /* _MACB_H */
diff --git a/drivers/net/ethernet/cadence/macb_ptp.c b/drivers/net/ethernet/cadence/macb_ptp.c
new file mode 100644
index 0000000..efefd9e
--- /dev/null
+++ b/drivers/net/ethernet/cadence/macb_ptp.c
@@ -0,0 +1,390 @@
+/*
+ * 1588 PTP support for GEM device.
+ *
+ * Copyright (C) 2016 Microchip Technology
+ *
+ * This file is licensed under the terms of the GNU General Public
+ * License version 2. This program is licensed "as is" without any
+ * warranty of any kind, whether express or implied.
+ */
+
+#include <linux/clk.h>
+#include <linux/device.h>
+#include <linux/etherdevice.h>
+#include <linux/platform_device.h>
+#include <linux/time64.h>
+#include <linux/ptp_classify.h>
+#include <linux/if_ether.h>
+#include <linux/if_vlan.h>
+#include <linux/net_tstamp.h>
+
+#include "macb.h"
+
+#define  GEM_PTP_TIMER_NAME "gem-ptp-timer"
+
+static inline void gem_tsu_get_time(struct macb *bp,
+				    struct timespec64 *ts)
+{
+	u64 sec, sech, secl;
+
+	spin_lock(&bp->tsu_clk_lock);
+
+	/* GEM's internal time */
+	sech = gem_readl(bp, TSH);
+	secl = gem_readl(bp, TSL);
+	ts->tv_nsec = gem_readl(bp, TN);
+	ts->tv_sec = (sech << 32) | secl;
+
+	/* minimize error */
+	sech = gem_readl(bp, TSH);
+	secl = gem_readl(bp, TSL);
+	sec = (sech << 32) | secl;
+	if (ts->tv_sec != sec) {
+		ts->tv_sec = sec;
+		ts->tv_nsec = gem_readl(bp, TN);
+	}
+
+	spin_unlock(&bp->tsu_clk_lock);
+}
+
+static inline void gem_tsu_set_time(struct macb *bp,
+				    const struct timespec64 *ts)
+{
+	u32 ns, sech, secl;
+	s64 word_mask = 0xffffffff;
+
+	sech = (u32)ts->tv_sec;
+	secl = (u32)ts->tv_sec;
+	ns = ts->tv_nsec;
+	if (ts->tv_sec > word_mask)
+		sech = (ts->tv_sec >> 32);
+
+	spin_lock(&bp->tsu_clk_lock);
+
+	/* TSH doesn't latch the time and no atomicity! */
+	gem_writel(bp, TN, 0); /* clear to avoid overflow */
+	gem_writel(bp, TSH, sech);
+	gem_writel(bp, TSL, secl);
+	gem_writel(bp, TN, ns);
+
+	spin_unlock(&bp->tsu_clk_lock);
+}
+
+static int gem_ptp_adjfine(struct ptp_clock_info *ptp, long scaled_ppm)
+{
+	struct macb *bp = container_of(ptp, struct macb, ptp_caps);
+	u32 word, diff;
+	u64 adj, rate;
+	int neg_adj = 0;
+
+	if (scaled_ppm < 0) {
+		neg_adj = 1;
+		scaled_ppm = -scaled_ppm;
+	}
+	rate = scaled_ppm;
+
+	/* word: unused(8bit) | ns(8bit) | fractions(16bit) */
+	word = (bp->ns_incr << 16) + bp->subns_incr;
+
+	adj = word;
+	adj *= rate;
+	adj >>= 16; /* remove fractions */
+	adj += 500000UL;
+	diff = div_u64(adj, 1000000UL);
+	word = neg_adj ? word - diff : word + diff;
+
+	spin_lock(&bp->tsu_clk_lock);
+
+	gem_writel(bp, TISUBN, GEM_BF(SUBNSINCR, (word & 0xffff)));
+	gem_writel(bp, TI, GEM_BF(NSINCR, (word >> 16)));
+
+	spin_unlock(&bp->tsu_clk_lock);
+	return 0;
+}
+
+static int gem_ptp_adjtime(struct ptp_clock_info *ptp, s64 delta)
+{
+	struct macb *bp = container_of(ptp, struct macb, ptp_caps);
+	struct timespec64 now, then = ns_to_timespec64(delta);
+	u32 adj, sign = 0;
+
+	if (delta < 0) {
+		delta = -delta;
+		sign = 1;
+	}
+
+	if (delta > 0x3FFFFFFF) {
+		gem_tsu_get_time(bp, &now);
+
+		if (sign)
+			now = timespec64_sub(now, then);
+		else
+			now = timespec64_add(now, then);
+
+		gem_tsu_set_time(bp, (const struct timespec64 *)&now);
+	} else {
+		adj = delta;
+		if (sign)
+			adj |= GEM_BIT(ADDSUB);
+
+		gem_writel(bp, TA, adj);
+	}
+
+	return 0;
+}
+
+static int gem_ptp_gettime(struct ptp_clock_info *ptp, struct timespec64 *ts)
+{
+	struct macb *bp = container_of(ptp, struct macb, ptp_caps);
+
+	gem_tsu_get_time(bp, ts);
+
+	return 0;
+}
+
+static int gem_ptp_settime(struct ptp_clock_info *ptp,
+			   const struct timespec64 *ts)
+{
+	struct macb *bp = container_of(ptp, struct macb, ptp_caps);
+
+	gem_tsu_set_time(bp, ts);
+
+	return 0;
+}
+
+static int gem_ptp_enable(struct ptp_clock_info *ptp,
+			  struct ptp_clock_request *rq, int on)
+{
+	return -EOPNOTSUPP;
+}
+
+static struct ptp_clock_info gem_ptp_caps_template = {
+	.owner		= THIS_MODULE,
+	.name		= GEM_PTP_TIMER_NAME,
+	.max_adj	= 0,
+	.n_alarm	= 0,
+	.n_ext_ts	= 0,
+	.n_per_out	= 0,
+	.n_pins		= 0,
+	.pps		= 0,
+	.adjfine	= gem_ptp_adjfine,
+	.adjtime	= gem_ptp_adjtime,
+	.gettime64	= gem_ptp_gettime,
+	.settime64	= gem_ptp_settime,
+	.enable		= gem_ptp_enable,
+};
+
+static s32 gem_ptp_max_adj(unsigned int f_nom)
+{
+	u64 adj;
+
+	/* The 48 bits of seconds for the GEM overflows every:
+	 * 2^48/(365.25 * 24 * 60 *60) =~ 8 925 512 years (~= 9 mil years),
+	 * thus the maximum adjust frequency must not overflow CNS register:
+	 *
+	 * addend  = 10^9/nominal_freq
+	 * adj_max = +/- addend*ppb_max/10^9
+	 * max_ppb = (2^8-1)*nominal_freq-10^9
+	 */
+	adj = f_nom;
+	adj *= 0xffff;
+	adj -= 1000000000ULL;
+	return adj;
+}
+
+static void gem_ptp_init_timer(struct macb *bp)
+{
+	struct timespec64 now;
+	u32 rem = 0;
+
+	getnstimeofday64(&now);
+	gem_tsu_set_time(bp, (const struct timespec64 *)&now);
+
+	bp->ns_incr = div_u64_rem(NSEC_PER_SEC, bp->tsu_rate, &rem);
+	if (rem) {
+		u64 adj = rem;
+
+		adj <<= 16; /* 16 bits nsec fragments */
+		bp->subns_incr = div_u64(adj, bp->tsu_rate);
+	} else {
+		bp->subns_incr = 0;
+	}
+
+	gem_writel(bp, TISUBN, GEM_BF(SUBNSINCR, bp->subns_incr));
+	gem_writel(bp, TI, GEM_BF(NSINCR, bp->ns_incr));
+	gem_writel(bp, TA, 0);
+}
+
+static void gem_ptp_clear_timer(struct macb *bp)
+{
+	bp->ns_incr = 0;
+	bp->subns_incr = 0;
+
+	gem_writel(bp, TISUBN, GEM_BF(SUBNSINCR, 0));
+	gem_writel(bp, TI, GEM_BF(NSINCR, 0));
+	gem_writel(bp, TA, 0);
+}
+
+/* While GEM can timestamp PTP packets, it does not mark the RX descriptor
+ * to identify them. UDP packets must be parsed to identify PTP packets.
+ *
+ * Note: Inspired from drivers/net/ethernet/ti/cpts.c
+ */
+static int gem_get_ptp_peer(struct sk_buff *skb, int ptp_class)
+{
+	unsigned int offset = 0;
+	u8 *msgtype, *data = skb->data;
+
+	/* PTP frames are rare! */
+	if (likely(ptp_class == PTP_CLASS_NONE))
+		return -1;
+
+	if (ptp_class & PTP_CLASS_VLAN)
+		offset += VLAN_HLEN;
+
+	switch (ptp_class & PTP_CLASS_PMASK) {
+	case PTP_CLASS_IPV4:
+		offset += ETH_HLEN + IPV4_HLEN(data + offset) + UDP_HLEN;
+		break;
+	case PTP_CLASS_IPV6:
+		offset += ETH_HLEN + IP6_HLEN + UDP_HLEN;
+		break;
+	case PTP_CLASS_L2:
+		offset += ETH_HLEN;
+		break;
+
+	/* something went wrong! */
+	default:
+		return -1;
+	}
+
+	if (skb->len + ETH_HLEN < offset + OFF_PTP_SEQUENCE_ID)
+		return -1;
+
+	if (unlikely(ptp_class & PTP_CLASS_V1))
+		msgtype = data + offset + OFF_PTP_CONTROL;
+	else
+		msgtype = data + offset;
+
+	return (*msgtype) & 0x2;
+}
+
+static void gem_ptp_tx_hwtstamp(struct macb *bp, struct sk_buff *skb,
+				int peer_ev)
+{
+	struct skb_shared_hwtstamps *shhwtstamps = skb_hwtstamps(skb);
+	struct timespec64 ts;
+	u64 ns;
+
+	/* PTP Peer Event Frame packets */
+	if (peer_ev) {
+		ts.tv_sec = gem_readl(bp, PEFTSL);
+		ts.tv_nsec = gem_readl(bp, PEFTN);
+
+	/* PTP Event Frame packets */
+	} else {
+		ts.tv_sec = gem_readl(bp, EFTSL);
+		ts.tv_nsec = gem_readl(bp, EFTN);
+	}
+	ns = timespec64_to_ns(&ts);
+
+	memset(shhwtstamps, 0, sizeof(struct skb_shared_hwtstamps));
+	shhwtstamps->hwtstamp = ns_to_ktime(ns);
+	skb_tstamp_tx(skb, skb_hwtstamps(skb));
+}
+
+static void gem_ptp_rx_hwtstamp(struct macb *bp, struct sk_buff *skb,
+				int peer_ev)
+{
+	struct skb_shared_hwtstamps *shhwtstamps = skb_hwtstamps(skb);
+	struct timespec64 ts;
+	u64 ns;
+
+	if (peer_ev) {
+		/* PTP Peer Event Frame packets */
+		ts.tv_sec = gem_readl(bp, PEFRSL);
+		ts.tv_nsec = gem_readl(bp, PEFRN);
+	} else {
+		/* PTP Event Frame packets */
+		ts.tv_sec = gem_readl(bp, EFRSL);
+		ts.tv_nsec = gem_readl(bp, EFRN);
+	}
+	ns = timespec64_to_ns(&ts);
+
+	memset(shhwtstamps, 0, sizeof(struct skb_shared_hwtstamps));
+	shhwtstamps->hwtstamp = ns_to_ktime(ns);
+}
+
+/* no static, GEM PTP interface functions */
+void gem_ptp_do_txstamp(struct macb *bp, struct sk_buff *skb)
+{
+	if (!bp->hwts_tx_en)
+		return;
+
+	if (unlikely(skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP)) {
+		int class = ptp_classify_raw(skb);
+		int peer;
+
+		peer = gem_get_ptp_peer(skb, class);
+		if (peer < 0)
+			return;
+
+		/* Timestamp this packet */
+		gem_ptp_tx_hwtstamp(bp, skb, peer);
+	}
+}
+
+void gem_ptp_do_rxstamp(struct macb *bp, struct sk_buff *skb)
+{
+	int class, peer;
+
+	if (!bp->hwts_rx_en)
+		return;
+
+	__skb_push(skb, ETH_HLEN);
+	class = ptp_classify_raw(skb);
+	__skb_pull(skb, ETH_HLEN);
+
+	peer = gem_get_ptp_peer(skb, class);
+	if (peer < 0)
+		return;
+
+	gem_ptp_rx_hwtstamp(bp, skb, peer);
+}
+
+void gem_ptp_init(struct net_device *ndev)
+{
+	struct macb *bp = netdev_priv(ndev);
+
+	spin_lock_init(&bp->tsu_clk_lock);
+	bp->ptp_caps = gem_ptp_caps_template;
+
+	/* nominal frequency and maximum adjustment in ppb */
+	bp->tsu_rate = bp->ptp_info->get_tsu_rate(bp);
+	bp->ptp_caps.max_adj = gem_ptp_max_adj(bp->tsu_rate);
+
+	gem_ptp_init_timer(bp);
+
+	bp->ptp_clock = ptp_clock_register(&bp->ptp_caps, NULL);
+	if (IS_ERR(bp->ptp_clock)) {
+		bp->ptp_clock = NULL;
+		pr_err("ptp clock register failed\n");
+		return;
+	}
+
+	dev_info(&bp->pdev->dev, "%s ptp clock registered.\n",
+		 GEM_PTP_TIMER_NAME);
+}
+
+void gem_ptp_remove(struct net_device *ndev)
+{
+	struct macb *bp = netdev_priv(ndev);
+
+	if (bp->ptp_clock)
+		ptp_clock_unregister(bp->ptp_clock);
+
+	gem_ptp_clear_timer(bp);
+
+	dev_info(&bp->pdev->dev, "%s ptp clock unregistered.\n",
+		 GEM_PTP_TIMER_NAME);
+}
-- 
2.7.4
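
The timer-increment arithmetic in gem_ptp_init_timer() and
gem_ptp_adjfine() above can be checked in user space. The following is
only a sketch under assumptions: the helper names tsu_increment() and
tsu_adjusted_word() are hypothetical and not part of the patch, plain C
division stands in for div_u64()/div_u64_rem(), and the TSU rates used
to exercise it are made-up examples.

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Split the nominal TSU clock period into whole nanoseconds and a
 * 16-bit binary fraction, following gem_ptp_init_timer().
 */
static void tsu_increment(uint32_t tsu_rate, uint32_t *ns_incr,
			  uint32_t *subns_incr)
{
	uint64_t rem = NSEC_PER_SEC % tsu_rate;

	*ns_incr = (uint32_t)(NSEC_PER_SEC / tsu_rate);
	/* rem < tsu_rate < 2^32, so the shifted value fits in 64 bits */
	*subns_incr = (uint32_t)((rem << 16) / tsu_rate);
}

/* Apply a scaled_ppm adjustment (ppm with 16 fractional bits) to the
 * packed increment word, following the arithmetic in gem_ptp_adjfine().
 */
static uint32_t tsu_adjusted_word(uint32_t ns_incr, uint32_t subns_incr,
				  long scaled_ppm)
{
	/* word: unused(8bit) | ns(8bit) | fractions(16bit) */
	uint32_t word = (ns_incr << 16) + subns_incr;
	int neg = scaled_ppm < 0;
	uint64_t adj = (uint64_t)word *
		       (uint64_t)(neg ? -scaled_ppm : scaled_ppm);
	uint32_t diff;

	adj >>= 16;	/* drop the fractional bits of scaled_ppm */
	adj += 500000;	/* round to nearest before dividing by 10^6 */
	diff = (uint32_t)(adj / 1000000);

	return neg ? word - diff : word + diff;
}
```

With an assumed 125 MHz TSU clock the period is exactly 8 ns, so the
sub-ns part is zero and a +1 ppm request (scaled_ppm = 65536) moves the
packed word by a single fractional step.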

^ permalink raw reply related

* Re: [PATCH net-next v3 1/1] driver: ipvlan: Free ipvl_port directly with kfree instead of kfree_rcu
From: David Miller @ 2016-12-07 18:22 UTC (permalink / raw)
  To: fgao; +Cc: maheshb, edumazet, netdev, gfree.wind
In-Reply-To: <1481071487-11227-1-git-send-email-fgao@ikuai8.com>

From: fgao@ikuai8.com
Date: Wed,  7 Dec 2016 08:44:47 +0800

> From: Gao Feng <fgao@ikuai8.com>
> 
> There are two functions which would free the ipvl_port now. The first
> is ipvlan_port_create. It frees the ipvl_port in the error handler,
> so it could kfree it directly. The second is ipvlan_port_destroy. It
> invokes netdev_rx_handler_unregister which enforces one grace period
> by synchronize_net firstly, so it also could kfree the ipvl_port
> directly and safely.
> 
> So it is unnecessary to use kfree_rcu to free ipvl_port.
> 
> Signed-off-by: Gao Feng <fgao@ikuai8.com>

Applied.

^ permalink raw reply

* Re: [PATCH net-next 1/1] driver: macvlan: Remove the rcu member of macvlan_port
From: David Miller @ 2016-12-07 18:22 UTC (permalink / raw)
  To: fgao; +Cc: kaber, netdev, gfree.wind
In-Reply-To: <1481084598-17351-1-git-send-email-fgao@ikuai8.com>

From: fgao@ikuai8.com
Date: Wed,  7 Dec 2016 12:23:18 +0800

> From: Gao Feng <fgao@ikuai8.com>
> 
> When freeing macvlan_port in macvlan_port_destroy, it is safe to free
> directly because netdev_rx_handler_unregister could enforce one
> grace period.
> So it is unnecessary to use kfree_rcu for macvlan_port.
> 
> Signed-off-by: Gao Feng <fgao@ikuai8.com>

Applied.

^ permalink raw reply

* Re: [PATCH net] phy: Don't increment MDIO bus refcount unless it's a different owner
From: David Miller @ 2016-12-07 18:29 UTC (permalink / raw)
  To: f.fainelli; +Cc: netdev, johan, rmk+kernel, andrew
In-Reply-To: <20161207045443.26246-1-f.fainelli@gmail.com>

From: Florian Fainelli <f.fainelli@gmail.com>
Date: Tue,  6 Dec 2016 20:54:43 -0800

> Commit 3e3aaf649416 ("phy: fix mdiobus module safety") fixed the way we
> dealt with MDIO bus module reference count, but sort of introduced a
> regression in that, if an Ethernet driver registers its own MDIO bus
> driver, as is common, we will end up with the Ethernet driver's
> module->refcnt set to 1, thus preventing this driver from any removal.
> 
> Fix this by comparing the network device's device driver owner against
> the MDIO bus driver owner, and only if they are different, increment the
> MDIO bus module refcount.
> 
> Fixes: 3e3aaf649416 ("phy: fix mdiobus module safety")
> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>

Applied.
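
The owner comparison described in the commit message can be modelled in
user space roughly as below. This is a sketch only: struct mod and both
helpers are hypothetical stand-ins, not the kernel's struct module or
the phylib code, and a plain counter replaces try_module_get() and
module_put().

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct module: only a reference count. */
struct mod {
	int refcnt;
};

/* Take a reference on the MDIO bus module only when it differs from
 * the module that owns the Ethernet driver. Returns true if a
 * reference was taken.
 */
static bool attach_get_bus_ref(struct mod *ndev_owner, struct mod *bus_owner)
{
	if (bus_owner == NULL || bus_owner == ndev_owner)
		return false;

	bus_owner->refcnt++;
	return true;
}

/* Drop the reference on detach under the same condition, so the two
 * operations stay balanced.
 */
static void detach_put_bus_ref(struct mod *ndev_owner, struct mod *bus_owner)
{
	if (bus_owner != NULL && bus_owner != ndev_owner)
		bus_owner->refcnt--;
}
```

When the Ethernet driver registers its own MDIO bus, both pointers name
the same module, no reference is taken, and the driver stays removable.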

^ permalink raw reply

* Re: [net-next] nlmon: use core MTU range checking in nlmon driver
From: David Miller @ 2016-12-07 18:30 UTC (permalink / raw)
  To: zhangshengju; +Cc: jarod, netdev
In-Reply-To: <1481102765-12021-1-git-send-email-zhangshengju@cmss.chinamobile.com>

From: Zhang Shengju <zhangshengju@cmss.chinamobile.com>
Date: Wed,  7 Dec 2016 17:26:05 +0800

> Since commit 61e84623ace3 ("net: centralize net_device min/max MTU checking"),
> mtu range is checked at dev_set_mtu().
> 
> This patch adds min_mtu for the nlmon device and removes the
> unnecessary ndo_change_mtu() function.
> 
> Signed-off-by: Zhang Shengju <zhangshengju@cmss.chinamobile.com>

Applied.

^ permalink raw reply

* Re: [net-next] dummy: expend mtu range for dummy device
From: David Miller @ 2016-12-07 18:30 UTC (permalink / raw)
  To: zhangshengju; +Cc: jarod, netdev
In-Reply-To: <1481103693-16394-1-git-send-email-zhangshengju@cmss.chinamobile.com>

From: Zhang Shengju <zhangshengju@cmss.chinamobile.com>
Date: Wed,  7 Dec 2016 17:41:33 +0800

> After commit 61e84623ace3 ("net: centralize net_device min/max MTU checking"),
> the mtu range for dummy device becomes [68, 1500].
> 
> This patch expends it to [0, 65535].
> 
> Signed-off-by: Zhang Shengju <zhangshengju@cmss.chinamobile.com>

Applied with "extends" typo fixed, thanks.
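
The effect of the two ranges quoted above can be illustrated with a
small user-space model of a centralized range check. This is a sketch
under assumptions: struct mtu_limits and mtu_in_range() are
hypothetical names, and the model only captures the idea of one shared
min/max test, not the exact logic of dev_set_mtu().

```c
#include <errno.h>

/* Hypothetical, reduced stand-in for the MTU limits of a net device. */
struct mtu_limits {
	unsigned int min_mtu;
	unsigned int max_mtu;
};

/* Return 0 if new_mtu lies within [min_mtu, max_mtu], else -EINVAL. */
static int mtu_in_range(const struct mtu_limits *lim, int new_mtu)
{
	if (new_mtu < 0 || (unsigned int)new_mtu < lim->min_mtu)
		return -EINVAL;
	if ((unsigned int)new_mtu > lim->max_mtu)
		return -EINVAL;
	return 0;
}
```

With limits of { 68, 1500 } an MTU of 9000 is rejected, while limits of
{ 0, 65535 } accept it. That is the behaviour change the dummy patch is
after.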

^ permalink raw reply

* Misalignment, MIPS, and ip_hdr(skb)->version
From: Jason A. Donenfeld @ 2016-12-07 18:35 UTC (permalink / raw)
  To: Netdev, linux-mips; +Cc: LKML, WireGuard mailing list

Hey MIPS Networking People,

I receive encrypted packets with a 13 byte header. I decrypt the
ciphertext in place, and then discard the header. I then pass the
plaintext to the rest of the networking stack. The plaintext is an IP
packet. Due to the 13 byte header that was discarded, the plaintext
possibly begins at an unaligned location (depending on whether
dev->needed_headroom was respected).

Does this matter? Is this bad? Will there be a necessary performance hit?

In order to find out, I instrumented the MIPS unaligned access
exception handler to see where I was actually in trouble.
Surprisingly, the only part of the stack that seemed to be upset was
on calls to ip_hdr(skb)->version.

Two things disturb me about this. First, this seems too good to be
true. Does it seem reasonable to you that this is actually the only
place that would be problematic? Or was my testing methodology wrong
to arrive at such an optimistic conclusion?

Secondly, why should a call to ip_hdr(skb)->version cause an unaligned
access anyway? This struct member is simply the second half of a
single byte in a bit field. I'd expect for the compiler to generate a
single byte load, followed by a bitshift or a mask. Instead, the
compiler appears to generate a double byte load, hence the exception.
What's up with this? Stupid compiler that should be fixed? Some odd
optimization? What to do?
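
For illustration, a read of the version field that cannot depend on
alignment might look like the sketch below. The helper name
ip_version_bytewise() is hypothetical and not a kernel API. It relies
only on the wire format, where the version is the high nibble of the
first header byte, and it says nothing about what the compiler emits
for the struct iphdr bit field.

```c
#include <stdint.h>

/* Read the IP version with a single byte load. On the wire the version
 * occupies the high four bits of the first header byte, so no wider
 * access (and hence no alignment requirement) is needed.
 */
static unsigned int ip_version_bytewise(const void *hdr)
{
	const uint8_t *p = hdr;

	return p[0] >> 4;
}
```

Because only one byte is touched, the result is the same whether the
header starts at an even or an odd address.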

I'm considering just adding an extra byte of padding (see discussion
in [1]), but before I make any decision like that (and hopefully it
won't be necessary), I'd like to completely and entirely understand
the full effects and consequences of calling netif_rx(skb) when
skb->data is unaligned. Any insight you have to offer would be most
welcome.

Thanks,
Jason

[1] https://lists.zx2c4.com/pipermail/wireguard/2016-December/000709.html

^ permalink raw reply

* Re: 4.9.0-rc8: tg3 dead after resume
From: Billy Shuman @ 2016-12-07 18:39 UTC (permalink / raw)
  To: Michael Chan; +Cc: Netdev, Siva Reddy Kallam
In-Reply-To: <CACKFLinTBokAXnOri9_sxuFby0y4VxxrhEOh1aayPsyPWwDahw@mail.gmail.com>

On Wed, Dec 7, 2016 at 12:37 PM, Michael Chan <michael.chan@broadcom.com> wrote:
> On Wed, Dec 7, 2016 at 7:20 AM, Billy Shuman <wshuman3@gmail.com> wrote:
>> After resume on 4.9.0-rc8 tg3 is dead.
>>
>> In logs I see:
>> kernel: tg3 0000:44:00.0: phy probe failed, err -19
>> kernel: tg3 0000:44:00.0: Problem fetching invariants of chip, aborting
>
> -19 is -ENODEV which means tg3 cannot read the PHY ID.
>
> If it's a true suspend/resume operation, the driver does not have to
> go through probe during resume.  Please explain how you do
> suspend/resume.

I used systemd
>
> Did this work before?  There have been very few changes to tg3 recently.
>
>>
>> rmmod and modprobe do not fix the problem; only a reboot resolves the issue.
>>
>> Billy

^ permalink raw reply

* Re: Misalignment, MIPS, and ip_hdr(skb)->version
From: Dave Taht @ 2016-12-07 18:47 UTC (permalink / raw)
  To: Jason A. Donenfeld; +Cc: Netdev, linux-mips, LKML, WireGuard mailing list
In-Reply-To: <CAHmME9o_eCNXpVztOZKW55kpRtE+1KSEQTQOjUBVn68Y2+or2g@mail.gmail.com>

The OpenWrt tree has long contained a set of patches that correct for
unaligned access issues throughout the Linux network stack.

https://git.lede-project.org/?p=openwrt/source.git;a=blob;f=target/linux/ar71xx/patches-4.4/910-unaligned_access_hacks.patch;h=b4b749e4b9c02a74a9f712a2740d63e554de5c64;hb=ee53a240ac902dc83209008a2671e7fdcf55957a

Unaligned access traps in the packet processing path on certain versions
of the MIPS architecture are horrifically bad. I had kind of hoped these
patches would have made it upstream in some form by now (or that the
arches that have the issue would be retired; I think it's mostly just
mips24k).

^ permalink raw reply

* Re: Misalignment, MIPS, and ip_hdr(skb)->version
From: David Miller @ 2016-12-07 18:51 UTC (permalink / raw)
  To: dave.taht; +Cc: netdev, linux-kernel, wireguard, linux-mips
In-Reply-To: <CAA93jw7hcmkcyD=t4VRrQFfHk+n+EkSVgY6KFDq0_-DGpMADYw@mail.gmail.com>

From: Dave Taht <dave.taht@gmail.com>
Date: Wed, 7 Dec 2016 10:47:16 -0800

> https://git.lede-project.org/?p=openwrt/source.git;a=blob;f=target/linux/ar71xx/patches-4.4/910-unaligned_access_hacks.patch;h=b4b749e4b9c02a74a9f712a2740d63e554de5c64;hb=ee53a240ac902dc83209008a2671e7fdcf55957a

It's so much better to analyze properly where the misalignment comes from
and address it at the source, as we have for various cases that trip up
Sparc too.

Marking structures "packed" is going to kill performance and is not
the answer.

^ permalink raw reply

* Re: 4.9.0-rc8: tg3 dead after resume
From: Billy Shuman @ 2016-12-07 18:44 UTC (permalink / raw)
  To: Michael Chan; +Cc: Netdev, Siva Reddy Kallam
In-Reply-To: <CACKFLinTBokAXnOri9_sxuFby0y4VxxrhEOh1aayPsyPWwDahw@mail.gmail.com>

On Wed, Dec 7, 2016 at 12:37 PM, Michael Chan <michael.chan@broadcom.com> wrote:
> On Wed, Dec 7, 2016 at 7:20 AM, Billy Shuman <wshuman3@gmail.com> wrote:
>> After resume on 4.9.0-rc8 tg3 is dead.
>>
>> In logs I see:
>> kernel: tg3 0000:44:00.0: phy probe failed, err -19
>> kernel: tg3 0000:44:00.0: Problem fetching invariants of chip, aborting
>
> -19 is -ENODEV which means tg3 cannot read the PHY ID.
>
> If it's a true suspend/resume operation, the driver does not have to
> go through probe during resume.  Please explain how you do
> suspend/resume.
>

Sorry, my previous message was accidentally sent too early.

I used systemd (systemctl suspend) to suspend.

> Did this work before?  There have been very few changes to tg3 recently.
>

This is a new laptop for me, but the same behavior is seen on 4.4.36 and 4.8.12.

>>
>> rmmod and modprobe do not fix the problem; only a reboot resolves the issue.
>>
>> Billy

^ permalink raw reply

