Netdev List


* Re: [PATCH net-next] tc_act: export all user headers
From: David Miller @ 2016-04-25 20:52 UTC
  To: stephen; +Cc: hadi, netdev
In-Reply-To: <20160425.164939.2072274531689250224.davem@davemloft.net>

From: David Miller <davem@davemloft.net>
Date: Mon, 25 Apr 2016 16:49:39 -0400 (EDT)

> From: Stephen Hemminger <stephen@networkplumber.org>
> Date: Fri, 22 Apr 2016 10:06:38 -0700
> 
>> The file tc_ife.h was missing from the export list.
>> Rather than continue to cherry-pick, just export all headers in the directory.
>> 
>> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
> 
> Applied.

Please compile test, pretty please??!?!?

./usr/include/linux/tc_act/*.h: No such file or directory

It looks like you can't expect shell expansions like that to work in
Kbuild files.


* Re: [PATCH v2 net-next 0/2] pskb_extract() helper function.
From: David Miller @ 2016-04-25 20:54 UTC
  To: sowmini.varadhan
  Cc: netdev, rds-devel, santosh.shilimkar, eric.dumazet,
	marcelo.leitner
In-Reply-To: <cover.1461368732.git.sowmini.varadhan@oracle.com>

From: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Date: Fri, 22 Apr 2016 18:36:34 -0700

> This patchset follows up on the discussion in
>  https://www.mail-archive.com/netdev@vger.kernel.org/msg105090.html
> 
> For RDS-TCP, we have to deal with the full gamut of
> nonlinear sk_buffs, including all the frag_list variants.
> Also, the parent skb has to remain unchanged, while the clone
> is queued for Rx on the PF_RDS socket. 
> 
> Patch 1 of this patchset adds a pskb_extract() function that 
> does all this without the redundant memcpy's in pskb_expand_head() 
> and __pskb_pull_tail().
> 
> v2: Marcelo Leitner review comments

Series applied, thanks.


* Re: [PATCH 1/6] bus: Add shared MDIO bus framework
From: Andrew Lunn @ 2016-04-25 20:56 UTC
  To: Pramod Kumar
  Cc: Rob Herring, Catalin Marinas, Will Deacon, Masahiro Yamada,
	Chen-Yu Tsai, Mark Rutland, devicetree-u79uwXL29TY76Z2rM5mHXA,
	Pawel Moll, Arnd Bergmann, Suzuki K Poulose,
	netdev-u79uwXL29TY76Z2rM5mHXA, Punit Agrawal,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, BCM Kernel Feedback,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, Anup Patel
In-Reply-To: <1461230323-27891-2-git-send-email-pramod.kumar-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>

Hi Pramod

I took a closer look. I don't see why the current MDIO code should not
be used, rather than adding a new framework.

What you need for your non-Ethernet PHYs is that they are somehow
probed. The current MDIO code will do that, based on the compatible
string. An mdio device gets passed a struct mdio_device * to its probe
function, giving you the bus and address on the bus for the
device. Your PHY driver can then register itself using
devm_of_phy_provider_register(). The user of the PHY then needs to use
devm_phy_get() to get a handle on the phy, and can then use
phy_power_on()/phy_power_off().

There is a very simple example here for an MDIO device driver:

http://thread.gmane.org/gmane.linux.network/393532

The muxing of the MDIO busses looks a little tricky. At the moment you have:

    writel(cmd, base + MDIO_PARAM_OFFSET);

which mixes together the muxing parameters and the write value. Can
this register be accessed as two 16-bit registers? If it can be, you
can cleanly separate out the muxing.
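
To make the question concrete, here is a minimal user-space sketch of what
"two 16-bit registers" would mean. The field layout (mux parameters in the
upper half, write value in the lower half) and every name in it are
assumptions for illustration only; they are not taken from the driver or
its datasheet.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout, for illustration only: the upper half of the
 * 32-bit parameter word selects the bus (mux), the lower half carries
 * the value to write. The real register layout may differ.
 */
#define PARAM_MUX_SHIFT		16
#define PARAM_DATA_MASK		0xffffu

/* What a single combined 32-bit write carries today. */
static inline uint32_t param_pack(uint32_t mux, uint32_t data)
{
	assert(mux <= 0xffffu);
	assert(data <= PARAM_DATA_MASK);

	return (mux << PARAM_MUX_SHIFT) | data;
}

/* If the halves can be accessed separately, a mux driver would own
 * this half ...
 */
static inline uint16_t param_mux_half(uint32_t cmd)
{
	return (uint16_t)(cmd >> PARAM_MUX_SHIFT);
}

/* ... and the MDIO bus driver would own this one. */
static inline uint16_t param_data_half(uint32_t cmd)
{
	return (uint16_t)(cmd & PARAM_DATA_MASK);
}
```

If the hardware allows it, the mux driver would only ever touch the first
half and the bus driver only the second, which is the separation the
existing mdio-mux drivers rely on.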

Take a look at mdio-mux-gpio.c and mdio-mux-mmioreg.c for examples of
MDIO muxes.

     Andrew


* Re: [PATCH RFC net-next] net: dsa: Provide CPU port statistics to master netdev
From: Andrew Lunn @ 2016-04-25 21:43 UTC
  To: Florian Fainelli; +Cc: netdev, davem, vivien.didelot
In-Reply-To: <1461175101-13506-1-git-send-email-f.fainelli@gmail.com>

On Wed, Apr 20, 2016 at 10:58:21AM -0700, Florian Fainelli wrote:
> This patch overloads the DSA master netdev, aka CPU Ethernet MAC to also
> include switch-side statistics, which is useful for debugging purposes,
> when the switch is not properly connected to the Ethernet MAC (duplex
> mismatch, (RG)MII electrical issues etc.).
> 
> We accomplish this by retaining the original copy of the master netdev's
> ethtool_ops, and just overload the 3 operations we care about:
> get_sset_count, get_strings and get_ethtool_stats so as to intercept
> these calls and call into the original master_netdev ethtool_ops, plus
> our own.

Hi Florian

Interesting concept. My one concern is that by concatenating the two
sets of statistics, we get a name clash. I'm not sure the Marvell
switch statistics counters have different names to the Marvell
Ethernet driver statistics counters. ethtool does not care, but maybe
an SNMP agent using these statistics might not be too happy seeing the
same name twice?
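
The concern can be checked mechanically. The sketch below is a user-space
illustration, not kernel code: it counts how many names in a switch-side
string table also appear in the master netdev's table. The helper name is
made up, and whether the Marvell tables really clash is not established
here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define ETH_GSTRING_LEN 32	/* same value as in the kernel's ethtool UAPI header */

/* Count the names in 'sw' (n_sw entries) that also occur in 'mac'
 * (n_mac entries). Both tables use the fixed-width ETH_GSTRING_LEN
 * layout that get_strings() fills in.
 */
static inline size_t count_name_clashes(const char mac[][ETH_GSTRING_LEN],
					size_t n_mac,
					const char sw[][ETH_GSTRING_LEN],
					size_t n_sw)
{
	size_t i, j, clashes = 0;

	assert(mac && sw);

	for (i = 0; i < n_sw; i++) {
		for (j = 0; j < n_mac; j++) {
			if (!strncmp(sw[i], mac[j], ETH_GSTRING_LEN)) {
				clashes++;
				break;	/* count each switch name once */
			}
		}
	}
	return clashes;
}
```

A non-zero result would mean the concatenated table contains repeated
names, which is the case an SNMP agent might trip over.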

     Andrew


* [PATCH v4 net-next 3/3] tcp: Handle eor bit when fragmenting a skb
From: Martin KaFai Lau @ 2016-04-25 21:44 UTC
  To: netdev
  Cc: Eric Dumazet, Neal Cardwell, Soheil Hassas Yeganeh,
	Willem de Bruijn, Yuchung Cheng, Kernel Team
In-Reply-To: <1461620690-1081063-1-git-send-email-kafai@fb.com>

When fragmenting a skb, the next_skb should carry
the eor from prev_skb.  The eor of prev_skb should
also be reset.
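
The rule can be modelled in user space as follows. This is a sketch under
stated assumptions: struct toy_skb and toy_fragment() are made-up stand-ins
that track only a length and the eor mark, not the kernel's struct sk_buff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for an skb: just a byte count and the eor mark. */
struct toy_skb {
	size_t len;
	bool eor;
};

/* Split 'skb' at 'len' bytes: 'skb' keeps the head, 'tail' gets the
 * rest. The eor mark belongs to the last byte, so it moves to 'tail'
 * and is cleared on 'skb', mirroring tcp_skb_fragment_eor().
 */
static inline void toy_fragment(struct toy_skb *skb, struct toy_skb *tail,
				size_t len)
{
	assert(len < skb->len);

	tail->len = skb->len - len;
	skb->len = len;

	tail->eor = skb->eor;
	skb->eor = false;
}
```

After the split only the fragment holding the final byte is still
protected from later appends; the head fragment behaves like any other skb.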

Packetdrill script for testing:
~~~~~~
+0 `sysctl -q -w net.ipv4.tcp_min_tso_segs=10`
+0 `sysctl -q -w net.ipv4.tcp_no_metrics_save=1`
+0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
+0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
+0 bind(3, ..., ...) = 0
+0 listen(3, 1) = 0

0.100 < S 0:0(0) win 32792 <mss 1460,sackOK,nop,nop,nop,wscale 7>
0.100 > S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 7>
0.200 < . 1:1(0) ack 1 win 257
0.200 accept(3, ..., ...) = 4
+0 setsockopt(4, SOL_TCP, TCP_NODELAY, [1], 4) = 0

0.200 sendto(4, ..., 15330, MSG_EOR, ..., ...) = 15330
0.200 sendto(4, ..., 730, 0, ..., ...) = 730

0.200 > .  1:7301(7300) ack 1
0.200 > . 7301:14601(7300) ack 1

0.300 < . 1:1(0) ack 14601 win 257
0.300 > P. 14601:15331(730) ack 1
0.300 > P. 15331:16061(730) ack 1

0.400 < . 1:1(0) ack 16061 win 257
0.400 close(4) = 0
0.400 > F. 16061:16061(0) ack 1
0.400 < F. 1:1(0) ack 16062 win 257
0.400 > . 16062:16062(0) ack 2

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Soheil Hassas Yeganeh <soheil@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
---
 net/ipv4/tcp_output.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index fa4d17f..55a926b 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -1128,6 +1128,12 @@ static void tcp_fragment_tstamp(struct sk_buff *skb, struct sk_buff *skb2)
 	}
 }
 
+static void tcp_skb_fragment_eor(struct sk_buff *skb, struct sk_buff *skb2)
+{
+	TCP_SKB_CB(skb2)->eor = TCP_SKB_CB(skb)->eor;
+	TCP_SKB_CB(skb)->eor = 0;
+}
+
 /* Function to create two new TCP segments.  Shrinks the given segment
  * to the specified size and appends a new segment with the rest of the
  * packet to the list.  This won't be called frequently, I hope.
@@ -1173,6 +1179,7 @@ int tcp_fragment(struct sock *sk, struct sk_buff *skb, u32 len,
 	TCP_SKB_CB(skb)->tcp_flags = flags & ~(TCPHDR_FIN | TCPHDR_PSH);
 	TCP_SKB_CB(buff)->tcp_flags = flags;
 	TCP_SKB_CB(buff)->sacked = TCP_SKB_CB(skb)->sacked;
+	tcp_skb_fragment_eor(skb, buff);
 
 	if (!skb_shinfo(skb)->nr_frags && skb->ip_summed != CHECKSUM_PARTIAL) {
 		/* Copy and checksum data tail into the new buffer. */
@@ -1733,6 +1740,8 @@ static int tso_fragment(struct sock *sk, struct sk_buff *skb, unsigned int len,
 	/* This packet was never sent out yet, so no SACK bits. */
 	TCP_SKB_CB(buff)->sacked = 0;
 
+	tcp_skb_fragment_eor(skb, buff);
+
 	buff->ip_summed = skb->ip_summed = CHECKSUM_PARTIAL;
 	skb_split(skb, buff, len);
 	tcp_fragment_tstamp(skb, buff);
-- 
2.5.1


* [PATCH v4 net-next 0/3] tcp: Make use of MSG_EOR in tcp_sendmsg
From: Martin KaFai Lau @ 2016-04-25 21:44 UTC
  To: netdev
  Cc: Eric Dumazet, Neal Cardwell, Soheil Hassas Yeganeh,
	Willem de Bruijn, Yuchung Cheng, Kernel Team

v4:
~ Do not set eor bit in do_tcp_sendpages() since there is
  no way to pass MSG_EOR from the userland now.
~ Avoid rmw by testing MSG_EOR first in tcp_sendmsg().
~ Move TCP_SKB_CB(skb)->eor test to a new helper
  tcp_skb_can_collapse_to() (suggested by Soheil).
~ Add some packetdrill tests.

v3:
~ Separate EOR marking from the SKBTX_ANY_TSTAMP logic.
~ Move the eor bit test back to the loop in tcp_sendmsg and
  tcp_sendpage because there could be >1 threads doing
  sendmsg.
~ Thanks to Eric Dumazet's suggestions on v2.
~ The TCP timestamp bug fixes are separated into other threads.

v2:
~ Rework based on the recent work
  "add TX timestamping via cmsg" by
  Soheil Hassas Yeganeh <soheil.kdev@gmail.com>
~ This version takes the MSG_EOR bit as a signal of
  end-of-response-message and leaves the selective
  timestamping job to the cmsg
~ Changes based on the v1 feedback (like avoiding an
  unlikely check in a loop and adding tcp_sendpage
  support)
~ The first 3 patches are bug fixes.  The fixes in this
  series depend on the newly introduced txstamp_ack in
  net-next.  I will make relevant patches against net after
  getting some feedback.
~ The test results are based on the recently posted net fix:
  "tcp: Fix SOF_TIMESTAMPING_TX_ACK when handling dup acks"

One potential use case is to use MSG_EOR with
SOF_TIMESTAMPING_TX_ACK to get more accurate
TCP ack timestamping for application protocols with
multiple outgoing response messages (e.g. HTTP2).

One of our use cases is at the webserver.  The webserver tracks
the HTTP2 response latency by measuring from when the webserver sends
the first byte to the socket until the TCP ACK of the last byte
is received.  In the cases where we don't have client side
measurement, measuring from the server side is the only option.
In the cases where we have the client side measurement, the server side
data can also be used to justify or cross-check the client
side data.


* [PATCH v4 net-next 2/3] tcp: Handle eor bit when coalescing skb
From: Martin KaFai Lau @ 2016-04-25 21:44 UTC
  To: netdev
  Cc: Eric Dumazet, Neal Cardwell, Soheil Hassas Yeganeh,
	Willem de Bruijn, Yuchung Cheng, Kernel Team
In-Reply-To: <1461620690-1081063-1-git-send-email-kafai@fb.com>

This patch:
1. Prevents next_skb from coalescing into prev_skb if
   TCP_SKB_CB(prev_skb)->eor is set
2. Updates TCP_SKB_CB(prev_skb)->eor if coalescing is
   allowed
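
The two points above can be sketched in user space as follows. The toy_*
names are hypothetical and the struct tracks only a length and the eor
mark; only the helper's role matches tcp_skb_can_collapse_to() from this
series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for an skb: just a byte count and the eor mark. */
struct toy_skb {
	size_t len;
	bool eor;
};

/* Same test as tcp_skb_can_collapse_to(): an eor-marked skb must not grow. */
static inline bool toy_can_collapse_to(const struct toy_skb *skb)
{
	return !skb->eor;
}

/* Try to merge 'next' into 'prev'. Returns false and leaves both
 * untouched if 'prev' is eor-marked (point 1). Otherwise 'prev'
 * takes over the data and the eor mark of 'next' (point 2).
 */
static inline bool toy_coalesce(struct toy_skb *prev, struct toy_skb *next)
{
	assert(prev != next);

	if (!toy_can_collapse_to(prev))
		return false;

	prev->len += next->len;
	prev->eor = next->eor;
	next->len = 0;
	next->eor = false;
	return true;
}
```

Because the merged skb inherits the mark of the skb it absorbed, a later
attempt to collapse more data into it is refused.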

Packetdrill script for testing:
~~~~~~
+0 `sysctl -q -w net.ipv4.tcp_min_tso_segs=10`
+0 `sysctl -q -w net.ipv4.tcp_no_metrics_save=1`
+0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
+0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
+0 bind(3, ..., ...) = 0
+0 listen(3, 1) = 0

0.100 < S 0:0(0) win 32792 <mss 1460,sackOK,nop,nop,nop,wscale 7>
0.100 > S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 7>
0.200 < . 1:1(0) ack 1 win 257
0.200 accept(3, ..., ...) = 4
+0 setsockopt(4, SOL_TCP, TCP_NODELAY, [1], 4) = 0

0.200 sendto(4, ..., 730, MSG_EOR, ..., ...) = 730
0.200 sendto(4, ..., 730, MSG_EOR, ..., ...) = 730
0.200 write(4, ..., 11680) = 11680

0.200 > P. 1:731(730) ack 1
0.200 > P. 731:1461(730) ack 1
0.200 > . 1461:8761(7300) ack 1
0.200 > P. 8761:13141(4380) ack 1

0.300 < . 1:1(0) ack 1 win 257 <sack 1461:13141,nop,nop>
0.300 > P. 1:731(730) ack 1
0.300 > P. 731:1461(730) ack 1
0.400 < . 1:1(0) ack 13141 win 257

0.400 close(4) = 0
0.400 > F. 13141:13141(0) ack 1
0.500 < F. 1:1(0) ack 13142 win 257
0.500 > . 13142:13142(0) ack 2

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Soheil Hassas Yeganeh <soheil@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
---
 net/ipv4/tcp_input.c  | 4 ++++
 net/ipv4/tcp_output.c | 4 ++++
 2 files changed, 8 insertions(+)

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index dcad8f9..65fb708 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -1303,6 +1303,7 @@ static bool tcp_shifted_skb(struct sock *sk, struct sk_buff *skb,
 	}
 
 	TCP_SKB_CB(prev)->tcp_flags |= TCP_SKB_CB(skb)->tcp_flags;
+	TCP_SKB_CB(prev)->eor = TCP_SKB_CB(skb)->eor;
 	if (TCP_SKB_CB(skb)->tcp_flags & TCPHDR_FIN)
 		TCP_SKB_CB(prev)->end_seq++;
 
@@ -1368,6 +1369,9 @@ static struct sk_buff *tcp_shift_skb_data(struct sock *sk, struct sk_buff *skb,
 	if ((TCP_SKB_CB(prev)->sacked & TCPCB_TAGBITS) != TCPCB_SACKED_ACKED)
 		goto fallback;
 
+	if (!tcp_skb_can_collapse_to(prev))
+		goto fallback;
+
 	in_sack = !after(start_seq, TCP_SKB_CB(skb)->seq) &&
 		  !before(end_seq, TCP_SKB_CB(skb)->end_seq);
 
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 9d3b4b3..fa4d17f 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -2494,6 +2494,7 @@ static void tcp_collapse_retrans(struct sock *sk, struct sk_buff *skb)
 	 * packet counting does not break.
 	 */
 	TCP_SKB_CB(skb)->sacked |= TCP_SKB_CB(next_skb)->sacked & TCPCB_EVER_RETRANS;
+	TCP_SKB_CB(skb)->eor = TCP_SKB_CB(next_skb)->eor;
 
 	/* changed transmit queue under us so clear hints */
 	tcp_clear_retrans_hints_partial(tp);
@@ -2545,6 +2546,9 @@ static void tcp_retrans_try_collapse(struct sock *sk, struct sk_buff *to,
 		if (!tcp_can_collapse(sk, skb))
 			break;
 
+		if (!tcp_skb_can_collapse_to(to))
+			break;
+
 		space -= skb->len;
 
 		if (first) {
-- 
2.5.1


* [PATCH v4 net-next 1/3] tcp: Make use of MSG_EOR in tcp_sendmsg
From: Martin KaFai Lau @ 2016-04-25 21:44 UTC
  To: netdev
  Cc: Eric Dumazet, Neal Cardwell, Soheil Hassas Yeganeh,
	Willem de Bruijn, Yuchung Cheng, Kernel Team
In-Reply-To: <1461620690-1081063-1-git-send-email-kafai@fb.com>

This patch adds an eor bit to the TCP_SKB_CB.  When MSG_EOR
is passed to tcp_sendmsg, the eor bit will be set on the skb
containing the last byte of the userland's msg.  The eor bit
will prevent data from being appended to that skb in the future.
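
The append decision can be modelled in user space as follows. This is a
sketch under stated assumptions: the toy_* names are made up, the queue is
a fixed array, and size_goal stands in loosely for the kernel's per-skb
size goal. It makes no attempt to model TSO, cwnd or memory accounting.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_SKBS 16

/* Toy stand-in for an skb: just a byte count and the eor mark. */
struct toy_skb {
	size_t len;
	bool eor;
};

struct toy_queue {
	struct toy_skb skb[TOY_MAX_SKBS];
	size_t count;
	size_t size_goal;	/* max bytes per skb in this model */
};

/* Queue 'size' bytes. Data is appended to the tail skb only if it has
 * room and is not eor-marked; otherwise a new skb is started. With
 * 'eor' set, the skb holding the last byte of the message is marked.
 */
static inline void toy_sendmsg(struct toy_queue *q, size_t size, bool eor)
{
	while (size > 0) {
		struct toy_skb *tail = q->count ? &q->skb[q->count - 1] : NULL;
		size_t copy;

		if (!tail || tail->len >= q->size_goal || tail->eor) {
			assert(q->count < TOY_MAX_SKBS);
			tail = &q->skb[q->count++];
			tail->len = 0;
			tail->eor = false;
		}

		copy = q->size_goal - tail->len;
		if (copy > size)
			copy = size;

		tail->len += copy;
		size -= copy;

		if (size == 0 && eor)
			tail->eor = true;
	}
}
```

In this model two 730-byte sends with the flag set end up in two separate
skbs, while two 730-byte sends without it share one.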

The change in do_tcp_sendpages is to honor the eor set
during the previous tcp_sendmsg(MSG_EOR) call.

This patch handles the tcp_sendmsg case.  The followup patches
will handle other skb coalescing and fragment cases.

One potential use case is to use MSG_EOR with
SOF_TIMESTAMPING_TX_ACK to get more accurate
TCP ack timestamping for application protocols with
multiple outgoing response messages (e.g. HTTP2).

Packetdrill script for testing:
~~~~~~
+0 `sysctl -q -w net.ipv4.tcp_min_tso_segs=10`
+0 `sysctl -q -w net.ipv4.tcp_no_metrics_save=1`
+0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
+0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
+0 bind(3, ..., ...) = 0
+0 listen(3, 1) = 0

0.100 < S 0:0(0) win 32792 <mss 1460,sackOK,nop,nop,nop,wscale 7>
0.100 > S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 7>
0.200 < . 1:1(0) ack 1 win 257
0.200 accept(3, ..., ...) = 4
+0 setsockopt(4, SOL_TCP, TCP_NODELAY, [1], 4) = 0

0.200 write(4, ..., 14600) = 14600
0.200 sendto(4, ..., 730, MSG_EOR, ..., ...) = 730
0.200 sendto(4, ..., 730, MSG_EOR, ..., ...) = 730

0.200 > .  1:7301(7300) ack 1
0.200 > P. 7301:14601(7300) ack 1

0.300 < . 1:1(0) ack 14601 win 257
0.300 > P. 14601:15331(730) ack 1
0.300 > P. 15331:16061(730) ack 1

0.400 < . 1:1(0) ack 16061 win 257
0.400 close(4) = 0
0.400 > F. 16061:16061(0) ack 1
0.400 < F. 1:1(0) ack 16062 win 257
0.400 > . 16062:16062(0) ack 2

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Soheil Hassas Yeganeh <soheil@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Suggested-by: Eric Dumazet <edumazet@google.com>
---
 include/net/tcp.h | 8 +++++++-
 net/ipv4/tcp.c    | 7 +++++--
 2 files changed, 12 insertions(+), 3 deletions(-)

diff --git a/include/net/tcp.h b/include/net/tcp.h
index 7f2553d..ce08038 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -762,7 +762,8 @@ struct tcp_skb_cb {
 
 	__u8		ip_dsfield;	/* IPv4 tos or IPv6 dsfield	*/
 	__u8		txstamp_ack:1,	/* Record TX timestamp for ack? */
-			unused:7;
+			eor:1,		/* Is skb MSG_EOR marked? */
+			unused:6;
 	__u32		ack_seq;	/* Sequence number ACK'd	*/
 	union {
 		struct inet_skb_parm	h4;
@@ -809,6 +810,11 @@ static inline int tcp_skb_mss(const struct sk_buff *skb)
 	return TCP_SKB_CB(skb)->tcp_gso_size;
 }
 
+static inline bool tcp_skb_can_collapse_to(const struct sk_buff *skb)
+{
+	return likely(!TCP_SKB_CB(skb)->eor);
+}
+
 /* Events passed to congestion control interface */
 enum tcp_ca_event {
 	CA_EVENT_TX_START,	/* first transmit when no packets in flight */
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 4d73858..ea5364b 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -908,7 +908,8 @@ static ssize_t do_tcp_sendpages(struct sock *sk, struct page *page, int offset,
 		int copy, i;
 		bool can_coalesce;
 
-		if (!tcp_send_head(sk) || (copy = size_goal - skb->len) <= 0) {
+		if (!tcp_send_head(sk) || (copy = size_goal - skb->len) <= 0 ||
+		    !tcp_skb_can_collapse_to(skb)) {
 new_segment:
 			if (!sk_stream_memory_free(sk))
 				goto wait_for_sndbuf;
@@ -1156,7 +1157,7 @@ int tcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
 			copy = max - skb->len;
 		}
 
-		if (copy <= 0) {
+		if (copy <= 0 || !tcp_skb_can_collapse_to(skb)) {
 new_segment:
 			/* Allocate new segment. If the interface is SG,
 			 * allocate skb fitting to single page.
@@ -1250,6 +1251,8 @@ new_segment:
 		copied += copy;
 		if (!msg_data_left(msg)) {
 			tcp_tx_timestamp(sk, sockc.tsflags, skb);
+			if (unlikely(flags & MSG_EOR))
+				TCP_SKB_CB(skb)->eor = 1;
 			goto out;
 		}
 
-- 
2.5.1


* Re: [PATCH net v2 2/3] drivers: net: cpsw: fix error messages when using phy-handle DT property
From: David Rivshin (Allworx) @ 2016-04-25 21:55 UTC
  To: Grygorii Strashko, Mugunthan V N
  Cc: Rob Herring, netdev, linux-omap, linux-arm-kernel, devicetree,
	linux-kernel, David Miller, Andrew Goodbody, Markus Brunner,
	Nicolas Chauvet
In-Reply-To: <571E6C14.8060007@ti.com>

On Mon, 25 Apr 2016 22:12:20 +0300
Grygorii Strashko <grygorii.strashko@ti.com> wrote:

> On 04/22/2016 06:45 PM, David Rivshin (Allworx) wrote:
> > On Fri, 22 Apr 2016 16:03:34 +0300
> > Grygorii Strashko <grygorii.strashko@ti.com> wrote:
> >   
> >> On 04/21/2016 09:26 PM, David Rivshin (Allworx) wrote:  
> >>> From: David Rivshin <drivshin@allworx.com>
> >>>
> >>> The phy-handle, phy_id, and fixed-link properties are mutually exclusive,
> >>> and only one need be specified. However if phy-handle was specified, an
> >>> error message would complain about the lack of phy_id or fixed-link.  
> 
> I think the commit message needs to be updated.
> You not only fix log messages - you also fix the issue with 
> of_get_phy_mode(slave_node); which will not be called if phy-handle is used.

You are correct, and that is probably the more important fix compared
to the error messages.

Because the content is becoming less coherent, what I may do is split 
this patch into 3 small patches:
A) devicetree binding documentation changes
B) cpsw_probe_dt changes, with the fixes for of_get_phy_mode() and
   related error message
C) cpsw_slave_open changes, with the fixes for crash if of_phy_connect
   returns NULL, and related error message 

Does that sound reasonable?

>  
> 
> slave_data->phy_if = of_get_phy_mode(slave_node); 
> ^ see below
> >>>
> >>> Also, if phy-handle was specified and the subsequent of_phy_connect()
> >>> failed, the error message still referenced slave->data->phy_id, which
> >>> would be empty. Instead, use the name of the device_node as a useful
> >>> identifier.
> >>>
> >>> Fixes: 9e42f715264f ("drivers: net: cpsw: add phy-handle parsing")
> >>> Signed-off-by: David Rivshin <drivshin@allworx.com>
> >>> Acked-by: Rob Herring <robh@kernel.org>
> >>> Tested-by: Nicolas Chauvet <kwizart@gmail.com>
> >>> ---
> >>> If you would like this for -stable it should apply cleanly as far back
> >>> as 4.5. It fails on 4.4 due to some context differences, but can be
> >>> applied with 'git am -C2'. Or, I can produce a separate patch against
> >>> linux-4.4.y if preferred.
> >>>
> >>> Changes since v1 [1]:
> >>> - Rebased (no conflicts)
> >>> - Added Tested-by from Nicolas Chauvet
> >>> - Added Acked-by from Rob Herring for the binding change
> >>>
> >>> [1] https://patchwork.ozlabs.org/patch/560324/
> >>>
> >>>    Documentation/devicetree/bindings/net/cpsw.txt |  4 ++--
> >>>    drivers/net/ethernet/ti/cpsw.c                 | 17 +++++++++++++----
> >>>    2 files changed, 15 insertions(+), 6 deletions(-)
> >>>
> >>> diff --git a/Documentation/devicetree/bindings/net/cpsw.txt b/Documentation/devicetree/bindings/net/cpsw.txt
> >>> index 28a4781..3033c0f 100644
> >>> --- a/Documentation/devicetree/bindings/net/cpsw.txt
> >>> +++ b/Documentation/devicetree/bindings/net/cpsw.txt
> >>> @@ -46,16 +46,16 @@ Optional properties:
> >>>    - dual_emac_res_vlan	: Specifies VID to be used to segregate the ports
> >>>    - mac-address		: See ethernet.txt file in the same directory
> >>>    - phy_id		: Specifies slave phy id  
> >>
> >> Maybe the "phy_id" can be marked as deprecated? (while here)
> >> The recommended property now is "phy-handle".  
> > 
> > I can certainly do that. Perhaps something like this?
> >   - phy_id		: Specifies slave phy id (deprecated, use phy-handle)
> > 
> > Rob, would you have any issues with bundling that?
> >   
> >>  
> >>>    - phy-handle		: See ethernet.txt file in the same directory
> >>>    
> >>>    Slave sub-nodes:
> >>>    - fixed-link		: See fixed-link.txt file in the same directory
> >>> -			  Either the property phy_id, or the sub-node
> >>> -			  fixed-link can be specified
> >>> +
> >>> +Note: Exactly one of phy_id, phy-handle, or fixed-link must be specified.
> >>>    
> >>>    Note: "ti,hwmods" field is used to fetch the base address and irq
> >>>    resources from TI, omap hwmod data base during device registration.
> >>>    Future plan is to migrate hwmod data base contents into device tree
> >>>    blob so that, all the required data will be used from device tree dts
> >>>    file.
> >>>    
> >>> diff --git a/drivers/net/ethernet/ti/cpsw.c b/drivers/net/ethernet/ti/cpsw.c
> >>> index d69cb3f..3c81413 100644
> >>> --- a/drivers/net/ethernet/ti/cpsw.c
> >>> +++ b/drivers/net/ethernet/ti/cpsw.c
> >>> @@ -1150,16 +1150,19 @@ static void cpsw_slave_open(struct cpsw_slave *slave, struct cpsw_priv *priv)
> >>>    	if (slave->data->phy_node)
> >>>    		slave->phy = of_phy_connect(priv->ndev, slave->data->phy_node,
> >>>    				 &cpsw_adjust_link, 0, slave->data->phy_if);
> >>>    	else
> >>>    		slave->phy = phy_connect(priv->ndev, slave->data->phy_id,
> >>>    				 &cpsw_adjust_link, slave->data->phy_if);
> >>>    	if (IS_ERR(slave->phy)) {
> >>> -		dev_err(priv->dev, "phy %s not found on slave %d\n",
> >>> -			slave->data->phy_id, slave->slave_num);
> >>> +		dev_err(priv->dev, "phy \"%s\" not found on slave %d\n",
> >>> +			slave->data->phy_node ?
> >>> +				slave->data->phy_node->full_name :
> >>> +				slave->data->phy_id,
> >>> +			slave->slave_num);  
> >>
> >> Unfortunately, there is some inconsistency between the legacy and FDT APIs :(
> >> of_phy_connect() will return valid phy_device or NULL, but phy_connect()
> >> can return valid phy_device or ERR_PTR().  
> > 
> > Good catch, I hadn't noticed that. It looks like that's actually a more
> > serious (pre-existing) bug: if of_phy_connect() returns NULL, we'd end
> > up dereferencing it and pagefaulting.
> > 
> > How about moving the IS_ERR() check into the phy_connect() case like this:
> > 	if (slave->data->phy_node) {
> > 		slave->phy = of_phy_connect(priv->ndev, slave->data->phy_node,
> > 				 &cpsw_adjust_link, 0, slave->data->phy_if);  
> 
> [1]
> 
> > 	} else {
> > 		slave->phy = phy_connect(priv->ndev, slave->data->phy_id,
> > 				 &cpsw_adjust_link, slave->data->phy_if);
> > 		if (IS_ERR(slave->phy))
> > 			slave->phy = NULL;  
> [2]
> > 	}
> > 	if (!slave->phy) {
> > 		dev_err(priv->dev, "phy \"%s\" not found on slave %d\n",
> > 			slave->data->phy_node ?
> > 				slave->data->phy_node->full_name :
> > 				slave->data->phy_id,
> > 			slave->slave_num);
> > 	} else {
> > 
> > Since you say the phy_id case is deprecated anyways, I'm not too concerned
> > about not printing the error code returned by phy_connect() in that case
> > (especially since it never did so in the past). That lets us still avoid
> > duplicating the dev_err() itself.  
> 
> I'm not too worried about duplicating dev_err() - it's always good to know
> the reason for the failure.
> 
> So, maybe for of_phy_connect() [1]:
>  dev_err(priv->dev, "phy \"%s\" not found on slave %d\n",
> 	slave->data->phy_node->full_name,
>  	slave->slave_num);
> 
> and for phy_connect() [2]:
>   dev_err(priv->dev, "phy %s not found on slave %d, err %ld\n",
>   	slave->data->phy_id, slave->slave_num, PTR_ERR(slave->phy));
> 
> Mugunthan, any comments?

If that's the preference, then I can incorporate that into V3.
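
For reference, the mismatch discussed above (of_phy_connect() reporting
failure as NULL, phy_connect() reporting it as an ERR_PTR()) can be modelled
in user space as follows. The toy_* helpers are imitations of the kernel's
ERR_PTR()/PTR_ERR()/IS_ERR() written for this sketch, and returning -ENODEV
for the NULL case is an arbitrary choice, not something the driver does.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* User-space imitations of the kernel's ERR_PTR helpers. */
#define TOY_MAX_ERRNO 4095

static inline void *toy_err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

static inline long toy_ptr_err(const void *p)
{
	return (long)(intptr_t)p;
}

/* True only for the top TOY_MAX_ERRNO addresses; NULL is NOT an error
 * pointer, which is why an IS_ERR()-only check misses a NULL return.
 */
static inline bool toy_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-TOY_MAX_ERRNO;
}

/* Fold both failure conventions into one: returns 0 and a usable
 * pointer in *out, or a negative errno and NULL in *out.
 */
static inline int toy_normalize(void *phy, void **out)
{
	assert(out);

	*out = NULL;
	if (!phy)
		return -ENODEV;			/* NULL-style failure */
	if (toy_is_err(phy))
		return (int)toy_ptr_err(phy);	/* ERR_PTR-style failure */

	*out = phy;
	return 0;
}
```

Either way, the caller ends up with a NULL pointer on failure, which is
what the later !slave->phy test expects, while the errno is still available
for the message in the phy_connect() case.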

> 
> > 
> >   
> >>
> >>
> >>  
> >>>    		slave->phy = NULL;
> >>>    	} else {
> >>>    		phy_attached_info(slave->phy);
> >>>    
> >>>    		phy_start(slave->phy);
> >>>    
> >>>    		/* Configure GMII_SEL register */
> >>> @@ -2030,15 +2033,19 @@ static int cpsw_probe_dt(struct cpsw_platform_data *data,
> >>>    		/* This is no slave child node, continue */
> >>>    		if (strcmp(slave_node->name, "slave"))
> >>>    			continue;
> >>>    
> >>>    		slave_data->phy_node = of_parse_phandle(slave_node,
> >>>    							"phy-handle", 0);
> >>>    		parp = of_get_property(slave_node, "phy_id", &lenp);
> >>> -		if (of_phy_is_fixed_link(slave_node)) {
> >>> +		if (slave_data->phy_node) {
> >>> +			dev_dbg(&pdev->dev,
> >>> +				"slave[%d] using phy-handle=\"%s\"\n",
> >>> +				i, slave_data->phy_node->full_name);
> >>> +		} else if (of_phy_is_fixed_link(slave_node)) {
> >>>    			struct device_node *phy_node;
> >>>    			struct phy_device *phy_dev;
> >>>    
> >>>    			/* In the case of a fixed PHY, the DT node associated
> >>>    			 * to the PHY is the Ethernet MAC DT node.
> >>>    			 */
> >>>    			ret = of_phy_register_fixed_link(slave_node);
> >>> @@ -2067,15 +2074,17 @@ static int cpsw_probe_dt(struct cpsw_platform_data *data,
> >>>    			if (!mdio) {
> >>>    				dev_err(&pdev->dev, "Missing mdio platform device\n");
> >>>    				return -EINVAL;
> >>>    			}
> >>>    			snprintf(slave_data->phy_id, sizeof(slave_data->phy_id),
> >>>    				 PHY_ID_FMT, mdio->name, phyid);
> >>>    		} else {
> >>> -			dev_err(&pdev->dev, "No slave[%d] phy_id or fixed-link property\n", i);
> >>> +			dev_err(&pdev->dev,
> >>> +				"No slave[%d] phy_id, phy-handle, or fixed-link property\n",
> >>> +				i);
> >>>    			goto no_phy_slave;
> >>>    		}
> >>>    		slave_data->phy_if = of_get_phy_mode(slave_node);  
> 
> Your change will allow the code to reach this point in case of phy-handle.
> 
> >>>    		if (slave_data->phy_if < 0) {
> >>>    			dev_err(&pdev->dev, "Missing or malformed slave[%d] phy-mode property\n",
> >>>    				i);
> >>>    			return slave_data->phy_if;
> >>>      
> >>
> >>  
> 
> 


* Re: [PATCH] net: ipv6: Delete host routes on an ifdown
From: David Ahern @ 2016-04-25 22:03 UTC
  To: David Miller; +Cc: netdev, mmanning
In-Reply-To: <20160425.164227.1599148827995063295.davem@davemloft.net>

On 4/25/16 2:42 PM, David Miller wrote:
> From: David Ahern <dsa@cumulusnetworks.com>
> Date: Mon, 25 Apr 2016 13:40:26 -0600
>
>> It's unfortunate you want to take that action. Last week I came across
>> a prior attempt by Stephen to do this same thing -- keep IPv6
>> addresses. That prior attempt was reverted by commit
>> 73a8bd74e261. Cumulus, Brocade, and others clearly want this
>> capability.
>
> But nobody has implemented it correctly, it doesn't matter who wants
> the feature.  That's why it keeps getting reverted.
>
> Also, this testing you are talking about should have happened long
> before you submitted that first patch that introduced all of these
> regressions.  My observations tell me that the bulk of the testing
> happened afterwards and that's why all the regressions are popping up
> now.
>

My testing when submitting the patch was host level: Add an address, 
while(1) (link up, link down), delete an address, etc.

Once it was committed to our kernel it started getting hit with a range
of L3 deployment scenarios with many nodes, where networking config files
are uploaded and switched between on real switch hardware - no reboot but
'networking reload' on the fly. Jumping between different deployments
with different sets of addresses, routes, vrf devices, bridges, bonds, etc.

Your objection seems to be 'all these regressions' but beyond the ref 
count from Andrey all of the bug reports have come from me with 1 from 
Mike, another invested party wanting this to happen. I am the one who 
spent the hours dealing with the kernel panics. My patch, my bug, my 
time wasted coming up with the delta patch. Rather than focusing on my 
mistakes, why not see the commitment to following through with this change?


* [PATCH] net: dsa: mv88e6xxx: fix uninitialized error return
From: Colin King @ 2016-04-25 22:11 UTC
  To: David S . Miller, Vivien Didelot, Andrew Lunn, netdev; +Cc: linux-kernel

From: Colin Ian King <colin.king@canonical.com>

The error return err is not initialized and there is a possibility
that err is not assigned causing mv88e6xxx_port_bridge_join to
return a garbage error return status. Fix this by initializing err
to 0.
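
A minimal user-space sketch of the failure mode, for illustration. The
loop shape is an assumption (the hunk below only shows the declaration),
and toy_bridge_join() with its parameters is made up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Visit every port that is a bridge member and apply a per-port
 * operation; port_result[i] is what that operation would return for
 * port i (0 or a negative errno) in this model.
 */
static inline int toy_bridge_join(const bool *member, const int *port_result,
				  size_t nports)
{
	int err = 0;	/* without this, zero matching ports returns garbage */
	size_t i;

	assert(member && port_result);

	for (i = 0; i < nports; i++) {
		if (!member[i])
			continue;	/* err is not assigned on this path */

		err = port_result[i];
		if (err)
			break;
	}
	return err;
}
```

With the initializer, the case where no port matches returns 0 instead of
whatever happened to be on the stack.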

Signed-off-by: Colin Ian King <colin.king@canonical.com>
---
 drivers/net/dsa/mv88e6xxx.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/dsa/mv88e6xxx.c b/drivers/net/dsa/mv88e6xxx.c
index 028f92f..98d3cfb 100644
--- a/drivers/net/dsa/mv88e6xxx.c
+++ b/drivers/net/dsa/mv88e6xxx.c
@@ -2207,7 +2207,7 @@ int mv88e6xxx_port_bridge_join(struct dsa_switch *ds, int port,
 			       struct net_device *bridge)
 {
 	struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
-	int i, err;
+	int i, err = 0;

 	mutex_lock(&ps->smi_mutex);

-- 
2.7.4


* [PATCH 1/2 net] lan78xx: fix statistics counter error
From: Woojung.Huh @ 2016-04-25 22:22 UTC
  To: davem; +Cc: netdev, UNGLinuxDriver

From: Woojung Huh <woojung.huh@microchip.com>

Fix rx_bytes, tx_bytes and tx_packets errors in netdev.stats.
- rx_bytes counted bytes excluding size of struct ethhdr.
- tx_packets didn't count multiple packets in a single urb
- tx_bytes included 8 bytes of extra commands.
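
The TX side of the accounting above can be sketched in user space as
follows. The struct and function names are made up for this illustration;
TX_OVERHEAD is taken to be the 8 bytes of extra commands mentioned in the
list.

```c
#include <assert.h>
#include <stddef.h>

#define TX_OVERHEAD 8	/* bytes of TX commands prepended to each frame */

struct toy_tx_stats {
	unsigned long tx_packets;
	unsigned long tx_bytes;
};

/* Account one completed URB that carried 'n' frames. len[i] is the
 * length of frame i as queued, i.e. including the TX_OVERHEAD bytes.
 */
static inline void toy_tx_complete(struct toy_tx_stats *st, const size_t *len,
				   size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		assert(len[i] >= TX_OVERHEAD);
		st->tx_bytes += len[i] - TX_OVERHEAD;	/* payload only */
	}
	st->tx_packets += n;	/* one per frame, not one per URB */
}
```

The RX fix in the diff is simpler: rx_bytes is taken from skb->len before
eth_type_trans() is called, so the Ethernet header is still counted.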

Signed-off-by: Woojung Huh <woojung.huh@microchip.com>
---
 drivers/net/usb/lan78xx.c | 15 ++++++++++-----
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/drivers/net/usb/lan78xx.c b/drivers/net/usb/lan78xx.c
index f20890e..0460b81 100644
--- a/drivers/net/usb/lan78xx.c
+++ b/drivers/net/usb/lan78xx.c
@@ -269,6 +269,7 @@ struct skb_data {		/* skb->cb is one of these */
 	struct lan78xx_net *dev;
 	enum skb_state state;
 	size_t length;
+	int num_of_packet;
 };
 
 struct usb_context {
@@ -2464,7 +2465,7 @@ static void tx_complete(struct urb *urb)
 	struct lan78xx_net *dev = entry->dev;
 
 	if (urb->status == 0) {
-		dev->net->stats.tx_packets++;
+		dev->net->stats.tx_packets += entry->num_of_packet;
 		dev->net->stats.tx_bytes += entry->length;
 	} else {
 		dev->net->stats.tx_errors++;
@@ -2681,10 +2682,11 @@ void lan78xx_skb_return(struct lan78xx_net *dev, struct sk_buff *skb)
 		return;
 	}
 
-	skb->protocol = eth_type_trans(skb, dev->net);
 	dev->net->stats.rx_packets++;
 	dev->net->stats.rx_bytes += skb->len;
 
+	skb->protocol = eth_type_trans(skb, dev->net);
+
 	netif_dbg(dev, rx_status, dev->net, "< rx, len %zu, type 0x%x\n",
 		  skb->len + sizeof(struct ethhdr), skb->protocol);
 	memset(skb->cb, 0, sizeof(struct skb_data));
@@ -2934,13 +2936,16 @@ static void lan78xx_tx_bh(struct lan78xx_net *dev)
 
 	skb_totallen = 0;
 	pkt_cnt = 0;
+	count = 0;
+	length = 0;
 	for (skb = tqp->next; pkt_cnt < tqp->qlen; skb = skb->next) {
 		if (skb_is_gso(skb)) {
 			if (pkt_cnt) {
 				/* handle previous packets first */
 				break;
 			}
-			length = skb->len;
+			count = 1;
+			length = skb->len - TX_OVERHEAD;
 			skb2 = skb_dequeue(tqp);
 			goto gso_skb;
 		}
@@ -2961,14 +2966,13 @@ static void lan78xx_tx_bh(struct lan78xx_net *dev)
 	for (count = pos = 0; count < pkt_cnt; count++) {
 		skb2 = skb_dequeue(tqp);
 		if (skb2) {
+			length += (skb2->len - TX_OVERHEAD);
 			memcpy(skb->data + pos, skb2->data, skb2->len);
 			pos += roundup(skb2->len, sizeof(u32));
 			dev_kfree_skb(skb2);
 		}
 	}
 
-	length = skb_totallen;
-
 gso_skb:
 	urb = usb_alloc_urb(0, GFP_ATOMIC);
 	if (!urb) {
@@ -2980,6 +2984,7 @@ gso_skb:
 	entry->urb = urb;
 	entry->dev = dev;
 	entry->length = length;
+	entry->num_of_packet = count;
 
 	spin_lock_irqsave(&dev->txq.lock, flags);
 	ret = usb_autopm_get_interface_async(dev->intf);
-- 
2.8.1
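
The TX accounting described in the commit message can be sketched in user space as follows. This is an illustration, not the driver: `struct urb_stats`, `roundup4()` and `account_urb()` are hypothetical names, and the sketch assumes every queued frame length already includes the 8-byte TX command overhead.

```c
#include <assert.h>
#include <stddef.h>

#define TX_OVERHEAD 8	/* per-frame command bytes, per the commit message */

struct urb_stats {
	size_t buf_pos;		/* offset reached in the URB buffer, with padding */
	size_t length;		/* bytes to add to tx_bytes */
	int num_of_packet;	/* frames to add to tx_packets */
};

/* Round len up to the next multiple of 4, like roundup(len, sizeof(u32)). */
static size_t roundup4(size_t len)
{
	return (len + 3) & ~(size_t)3;
}

/* frame_len[i] is the queued length of frame i, including TX_OVERHEAD. */
static struct urb_stats account_urb(const size_t *frame_len, int n)
{
	struct urb_stats s = { 0, 0, 0 };
	int i;

	for (i = 0; i < n; i++) {
		assert(frame_len[i] >= TX_OVERHEAD);
		s.length += frame_len[i] - TX_OVERHEAD;	/* payload only */
		s.buf_pos += roundup4(frame_len[i]);
		s.num_of_packet++;
	}
	return s;
}
```

On completion of one URB, the statistics would then advance by `num_of_packet` frames and `length` bytes instead of by one frame and the padded buffer size.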

^ permalink raw reply related

* [PATCH 0/2 net] lan78xx: patch series
From: Woojung.Huh @ 2016-04-25 22:22 UTC (permalink / raw)
  To: davem; +Cc: netdev, UNGLinuxDriver

From: Woojung Huh <woojung.huh@microchip.com>

Woojung Huh (2):
 lan78xx: fix statistics counter error
 lan78xx: workaround of forced 100 Full/Half duplex mode error

 drivers/net/usb/lan78xx.c | 44 ++++++++++++++++++++++++++++++++++++++------
 1 file changed, 38 insertions(+), 6 deletions(-)

-- 
2.8.1

^ permalink raw reply

* [PATCH 2/2 net] lan78xx: workaround of forced 100 Full/Half duplex mode error
From: Woojung.Huh @ 2016-04-25 22:22 UTC (permalink / raw)
  To: davem; +Cc: netdev, UNGLinuxDriver

From: Woojung Huh <woojung.huh@microchip.com>

In forced 100 Full/Half duplex mode, the chip may fail to set the mode correctly
when the cable is switched between a long (~50+ m) one and a short one.
As a workaround, set the speed to 10 before setting it to 100 in forced 100 F/H mode.

Signed-off-by: Woojung Huh <woojung.huh@microchip.com>
---
 drivers/net/usb/lan78xx.c | 29 ++++++++++++++++++++++++++++-
 1 file changed, 28 insertions(+), 1 deletion(-)

diff --git a/drivers/net/usb/lan78xx.c b/drivers/net/usb/lan78xx.c
index 0460b81..f64778a 100644
--- a/drivers/net/usb/lan78xx.c
+++ b/drivers/net/usb/lan78xx.c
@@ -1804,7 +1804,34 @@ static void lan78xx_remove_mdio(struct lan78xx_net *dev)
 
 static void lan78xx_link_status_change(struct net_device *net)
 {
-	/* nothing to do */
+	struct phy_device *phydev = net->phydev;
+	int ret, temp;
+
+	/* At forced 100 F/H mode, chip may fail to set mode correctly
+	 * when cable is switched between long(~50+m) and short one.
+	 * As workaround, set to 10 before setting to 100
+	 * at forced 100 F/H mode.
+	 */
+	if (!phydev->autoneg && (phydev->speed == 100)) {
+		/* disable phy interrupt */
+		temp = phy_read(phydev, LAN88XX_INT_MASK);
+		temp &= ~LAN88XX_INT_MASK_MDINTPIN_EN_;
+		ret = phy_write(phydev, LAN88XX_INT_MASK, temp);
+
+		temp = phy_read(phydev, MII_BMCR);
+		temp &= ~(BMCR_SPEED100 | BMCR_SPEED1000);
+		phy_write(phydev, MII_BMCR, temp); /* set to 10 first */
+		temp |= BMCR_SPEED100;
+		phy_write(phydev, MII_BMCR, temp); /* set to 100 later */
+
+		/* clear pending interrupt generated while workaround */
+		temp = phy_read(phydev, LAN88XX_INT_STS);
+
+		/* enable phy interrupt back */
+		temp = phy_read(phydev, LAN88XX_INT_MASK);
+		temp |= LAN88XX_INT_MASK_MDINTPIN_EN_;
+		ret = phy_write(phydev, LAN88XX_INT_MASK, temp);
+	}
 }
 
 static int lan78xx_phy_init(struct lan78xx_net *dev)
-- 
2.8.1
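
The BMCR sequence in the workaround can be tried out in user space with a fake register. The sketch below is only an illustration: `struct fake_phy`, `fake_write_bmcr()` and `force_100_via_10()` are hypothetical, while the BMCR bit values are copied from linux/mii.h. The interrupt masking done by the real patch is left out.

```c
#include <assert.h>
#include <stdint.h>

#define BMCR_SPEED1000	0x0040	/* MSB of speed selection (linux/mii.h) */
#define BMCR_FULLDPLX	0x0100	/* full duplex (linux/mii.h) */
#define BMCR_SPEED100	0x2000	/* LSB of speed selection (linux/mii.h) */

/* Hypothetical fake PHY: one BMCR register plus a log of writes. */
struct fake_phy {
	uint16_t bmcr;
	uint16_t log[4];
	int nwrites;
};

static void fake_write_bmcr(struct fake_phy *phy, uint16_t val)
{
	phy->bmcr = val;
	if (phy->nwrites < 4)
		phy->log[phy->nwrites++] = val;
}

/* Force 10 Mb/s first, then 100 Mb/s, leaving the other BMCR bits alone. */
static void force_100_via_10(struct fake_phy *phy)
{
	uint16_t temp = phy->bmcr;

	temp &= (uint16_t)~(BMCR_SPEED100 | BMCR_SPEED1000);
	fake_write_bmcr(phy, temp);	/* both speed bits clear: 10 Mb/s */
	temp |= BMCR_SPEED100;
	fake_write_bmcr(phy, temp);	/* speed bits now select 100 Mb/s */
}
```

The write log makes the ordering visible: the first write clears both speed bits and the second sets only BMCR_SPEED100, so the duplex bit is preserved across both writes.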

^ permalink raw reply related

* Re: [PATCH] net: ipv6: Delete host routes on an ifdown
From: Roopa Prabhu @ 2016-04-25 22:30 UTC (permalink / raw)
  To: David Miller; +Cc: dsa, netdev, mmanning
In-Reply-To: <20160425.164227.1599148827995063295.davem@davemloft.net>

On 4/25/16, 1:42 PM, David Miller wrote:
> From: David Ahern <dsa@cumulusnetworks.com>
> Date: Mon, 25 Apr 2016 13:40:26 -0600
>
>> It's unfortunate you want to take that action. Last week I came across
>> a prior attempt by Stephen to do this same thing -- keep IPv6
>> addresses. That prior attempt was reverted by commit
>> 73a8bd74e261. Cumulus, Brocade, and others clearly want this
>> capability.
> But nobody has implemented it correctly, it doesn't matter who wants
> the feature.  That's why it keeps getting reverted.
>
> Also, this testing you are talking about should have happened long
> before you submitted that first patch that introduced all of these
> regressions.  My observations tell me that the bulk of the testing
> happened afterwards and that's why all the regressions are popping up
> now.
Sorry if it seems that way, but we have been testing several versions of this patch
internally. David A. has been throwing it at all of our internal tests just to make sure
it gets all the testing it needs before 4.6 goes out. This last fix addresses something
that I think got introduced in one of the later versions while re-implementing
bits of it based on feedback. One of our recent tests under stress
caught it, and we rushed the fix out.

thanks,
Roopa

^ permalink raw reply

* Re: [PATCH V3] net: stmmac: socfpga: Remove re-registration of reset controller
From: Marek Vasut @ 2016-04-25 22:55 UTC (permalink / raw)
  To: Joachim Eastwood
  Cc: netdev, peppe.cavallaro, alexandre.torgue, Matthew Gerlach,
	Dinh Nguyen, David S . Miller
In-Reply-To: <CAGhQ9VzpZcP23DC0LZ95YggtsBDsUCe-NHEtna7hY9Yvdr=ZGg@mail.gmail.com>

On 04/25/2016 08:11 PM, Joachim Eastwood wrote:
> Hi Marek,

Hi!

> On 21 April 2016 at 14:11, Marek Vasut <marex@denx.de> wrote:
>> Both socfpga_dwmac_parse_data() in dwmac-socfpga.c and stmmac_dvr_probe()
>> in stmmac_main.c functions call devm_reset_control_get() to register a
>> reset controller for the stmmac. This results in an attempt to register
>> two reset controllers for the same non-shared reset line.
>>
>> The first attempt to register the reset controller works fine. The second
>> attempt fails with warning from the reset controller core, see below.
>> The warning is produced because the reset line is non-shared and thus
>> it may have at most one reset controller associated with it,
>> not two or more.
>>
>> The solution has multiple parts. First, the original socfpga_dwmac_init()
>> is tweaked to use reset controller pointer from the stmmac_priv (private
>> data of the stmmac core) instead of the local instance, which was used
>> before. The local re-registration of the reset controller is removed.
>>
>> Next, the socfpga_dwmac_init() is moved after stmmac_dvr_probe() in the
>> probe function. This order is legal according to Altera and it makes the
>> code much easier, since there is no need to temporarily register and
>> unregister the reset controller; the reset controller is already registered
>> by the stmmac_dvr_probe().
>>
>> Finally, plat_dat->exit and socfpga_dwmac_exit() are no longer necessary,
>> since the functionality is already performed by the stmmac core.
> 
> I am trying to rebase my changes on top of your two patches and
> noticed a couple of things.
> 
>>  static int socfpga_dwmac_init(struct platform_device *pdev, void *priv)
>>  {
>> -       struct socfpga_dwmac    *dwmac = priv;
>> +       struct socfpga_dwmac *dwmac = priv;
>>         struct net_device *ndev = platform_get_drvdata(pdev);
>>         struct stmmac_priv *stpriv = NULL;
>>         int ret = 0;
>>
>> -       if (ndev)
>> -               stpriv = netdev_priv(ndev);
>> +       if (!ndev)
>> +               return -EINVAL;
> 
> ndev can never be NULL here. socfpga_dwmac_init() is only called if
> stmmac_dvr_probe() succeeds or we are running the resume callback. So
> I don't see how this could ever be NULL.

That's a good point, this check can indeed be removed. While you're
patching this anyway, can you remove this one?

>> +
>> +       stpriv = netdev_priv(ndev);
> 
> It's not really nice to access 'stmmac_priv' as it should be private
> to the core driver, but I don't see any other good solution right now.

I guess some stmmac_reset_assert() wrapper would be nicer, yes. What do
you think ?

>> +       if (!stpriv)
>> +               return -EINVAL;
>>
>>         /* Assert reset to the enet controller before changing the phy mode */
>> -       if (dwmac->stmmac_rst)
>> -               reset_control_assert(dwmac->stmmac_rst);
>> +       if (stpriv->stmmac_rst)
>> +               reset_control_assert(stpriv->stmmac_rst);
>>
>>         /* Setup the phy mode in the system manager registers according to
>>          * devicetree configuration
>> @@ -227,8 +210,8 @@ static int socfpga_dwmac_init(struct platform_device *pdev, void *priv)
>>         /* Deassert reset for the phy configuration to be sampled by
>>          * the enet controller, and operation to start in requested mode
>>          */
>> -       if (dwmac->stmmac_rst)
>> -               reset_control_deassert(dwmac->stmmac_rst);
>> +       if (stpriv->stmmac_rst)
>> +               reset_control_deassert(stpriv->stmmac_rst);
>>
>>         /* Before the enet controller is suspended, the phy is suspended.
>>          * This causes the phy clock to be gated. The enet controller is
>> @@ -245,7 +228,7 @@ static int socfpga_dwmac_init(struct platform_device *pdev, void *priv)
>>          * control register 0, and can be modified by the phy driver
>>          * framework.
>>          */
>> -       if (stpriv && stpriv->phydev)
>> +       if (stpriv->phydev)
>>                 phy_resume(stpriv->phydev);
> 
> Before this change phy_resume() was only called during driver
> resume, but your patches cause phy_resume() to be called at probe time as
> well. Is this okay?

I _hope_ it's OK. The cryptic comment above is not very helpful in this
aspect. Dinh ? :)

> regards,
> Joachim Eastwood
> 

btw I wish you had reviewed my patch a bit earlier to catch these bits.

-- 
Best regards,
Marek Vasut

^ permalink raw reply

* Re: [PATCH v4 net-next 1/3] tcp: Make use of MSG_EOR in tcp_sendmsg
From: Eric Dumazet @ 2016-04-25 23:02 UTC (permalink / raw)
  To: Martin KaFai Lau
  Cc: netdev, Eric Dumazet, Neal Cardwell, Soheil Hassas Yeganeh,
	Willem de Bruijn, Yuchung Cheng, Kernel Team
In-Reply-To: <1461620690-1081063-2-git-send-email-kafai@fb.com>

On Mon, 2016-04-25 at 14:44 -0700, Martin KaFai Lau wrote:
> This patch adds an eor bit to the TCP_SKB_CB.  When MSG_EOR
> is passed to tcp_sendmsg, the eor bit will be set at the skb
> containing the last byte of the userland's msg.  The eor bit
> will prevent data from appending to that skb in the future.

Acked-by: Eric Dumazet <edumazet@google.com>

Thanks !

^ permalink raw reply

* Re: [PATCH v4 net-next 2/3] tcp: Handle eor bit when coalescing skb
From: Eric Dumazet @ 2016-04-25 23:03 UTC (permalink / raw)
  To: Martin KaFai Lau
  Cc: netdev, Eric Dumazet, Neal Cardwell, Soheil Hassas Yeganeh,
	Willem de Bruijn, Yuchung Cheng, Kernel Team
In-Reply-To: <1461620690-1081063-3-git-send-email-kafai@fb.com>

On Mon, 2016-04-25 at 14:44 -0700, Martin KaFai Lau wrote:
> This patch:
> 1. Prevent next_skb from coalescing to the prev_skb if
>    TCP_SKB_CB(prev_skb)->eor is set
> 2. Update the TCP_SKB_CB(prev_skb)->eor if coalescing is
>    allowed

Acked-by: Eric Dumazet <edumazet@google.com>
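
The two rules listed in the quoted commit message can be modelled with a toy segment type. This is a sketch under assumptions, not the kernel code: `struct seg` and `seg_try_coalesce()` are hypothetical stand-ins for an skb and its coalescing path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an skb: only a length and the eor bit. */
struct seg {
	size_t len;
	bool eor;
};

/* Try to merge next into prev. Returns true if the merge happened. */
static bool seg_try_coalesce(struct seg *prev, struct seg *next)
{
	if (prev->eor)
		return false;	/* prev ends a message: keep the boundary */

	prev->len += next->len;
	prev->eor = next->eor;	/* merged segment ends where next ended */
	next->len = 0;
	next->eor = false;
	return true;
}
```

Rule 1 is the early return; rule 2 is the copy of `next->eor` into `prev->eor`, which keeps the message boundary attached to the segment that now holds the last byte.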

^ permalink raw reply

* Re: [PATCH] devlink: export header
From: Stephen Hemminger @ 2016-04-25 23:03 UTC (permalink / raw)
  To: David Miller; +Cc: jiri, netdev
In-Reply-To: <20160425.164928.589496341257465225.davem@davemloft.net>

On Mon, 25 Apr 2016 16:49:28 -0400 (EDT)
David Miller <davem@davemloft.net> wrote:

> From: Stephen Hemminger <stephen@networkplumber.org>
> Date: Fri, 22 Apr 2016 09:55:17 -0700
> 
> > Export devlink.h when doing make headers_install.
> > I am going to investigate just doing all headers in the directory,
> > but let's add the missing piece for now.
> > 
> > Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
> 
> This is already exported in the 'net' tree.

ok, thanks

^ permalink raw reply

* Re: [PATCH net-next] tc_act: export all user headers
From: Stephen Hemminger @ 2016-04-25 23:05 UTC (permalink / raw)
  To: David Miller; +Cc: hadi, netdev
In-Reply-To: <20160425.165244.795853255564768080.davem@davemloft.net>

On Mon, 25 Apr 2016 16:52:44 -0400 (EDT)
David Miller <davem@davemloft.net> wrote:

> From: David Miller <davem@davemloft.net>
> Date: Mon, 25 Apr 2016 16:49:39 -0400 (EDT)
> 
> > From: Stephen Hemminger <stephen@networkplumber.org>
> > Date: Fri, 22 Apr 2016 10:06:38 -0700
> > 
> >> The file tc_ife.h was missing from the export list.
> >> Rather than continue to cherry-pick, just export all headers in the directory.
> >> 
> >> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
> > 
> > Applied.
> 
> Please compile test, pretty please??!?!?
> 
> ./usr/include/linux/tc_act/*.h: No such file or directory
> 
> It looks like you can't expect shell expansions like that to work in
> Kbuild files.

It worked for me, but I was testing the export function, not the build.

^ permalink raw reply

* Re: [PATCH v4 net-next 3/3] tcp: Handle eor bit when fragmenting a skb
From: Eric Dumazet @ 2016-04-25 23:04 UTC (permalink / raw)
  To: Martin KaFai Lau
  Cc: netdev, Eric Dumazet, Neal Cardwell, Soheil Hassas Yeganeh,
	Willem de Bruijn, Yuchung Cheng, Kernel Team
In-Reply-To: <1461620690-1081063-4-git-send-email-kafai@fb.com>

On Mon, 2016-04-25 at 14:44 -0700, Martin KaFai Lau wrote:
> When fragmenting a skb, the next_skb should carry
> the eor from prev_skb.  The eor of prev_skb should
> also be reset.


Acked-by: Eric Dumazet <edumazet@google.com>
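
The fragmenting rule in the quoted text can be sketched with a toy segment type. `struct seg` and `seg_fragment()` are hypothetical names used only for illustration; the real code operates on skbs and their control blocks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an skb: only a length and the eor bit. */
struct seg {
	size_t len;
	bool eor;
};

/* Split head at head_len bytes; tail receives the remainder. */
static void seg_fragment(struct seg *head, struct seg *tail, size_t head_len)
{
	assert(head_len <= head->len);

	tail->len = head->len - head_len;
	tail->eor = head->eor;	/* the last byte now lives in tail */
	head->len = head_len;
	head->eor = false;	/* head no longer ends the message */
}
```

After the split, only the segment that carries the final byte of the message is marked, so later data can still be kept out of it while the front part behaves like any other segment.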

^ permalink raw reply

* Re: [PATCH net-next 2/6] atl1c: remove private tx lock
From: Francois Romieu @ 2016-04-25 23:16 UTC (permalink / raw)
  To: Florian Westphal; +Cc: netdev, linux-kernel, Jay Cliburn, Chris Snook
In-Reply-To: <20160425154339.GA17538@breakpoint.cc>

Florian Westphal <fw@strlen.de> :
> Francois Romieu <romieu@fr.zoreil.com> wrote:
[...]
> > Play it safe and keep the implicit local_irq_{save / restore} call ?
> > 
> > It may not be needed but it will help avoiding any unexpected regression
> > report pointing at the NETDEV_TX_LOCKED removal change.
> 
> I thought about that but it doesn't prevent the irq handler from
> running on another CPU, so leaving it around seemed like cargo culting
> to me...

I don't mind removing it in a different patch at all. I'd rather see
the commit history underline that it's unrelated to whatever
NETDEV_TX_LOCKED / LLTX change.

> I don't have an atl1c, but the atl1e in my laptop seems to work fine
> with the (similar) change.
> 
> If you disagree I can respin with local_irq_save of course, but if
> 'playing it safe' is the main goal then it's simpler to convert
> spin_trylock_irqsave to spin_lock_irqsave.

Your call, really.

-- 
Ueimor

^ permalink raw reply

* Re: [PATCH v4 net-next 1/3] tcp: Make use of MSG_EOR in tcp_sendmsg
From: Soheil Hassas Yeganeh @ 2016-04-26  0:48 UTC (permalink / raw)
  To: Martin KaFai Lau
  Cc: netdev, Eric Dumazet, Neal Cardwell, Willem de Bruijn,
	Yuchung Cheng, Kernel Team
In-Reply-To: <1461620690-1081063-2-git-send-email-kafai@fb.com>

On Mon, Apr 25, 2016 at 5:44 PM, Martin KaFai Lau <kafai@fb.com> wrote:
> This patch adds an eor bit to the TCP_SKB_CB.  When MSG_EOR
> is passed to tcp_sendmsg, the eor bit will be set at the skb
> containing the last byte of the userland's msg.  The eor bit
> will prevent data from appending to that skb in the future.
>
> The change in do_tcp_sendpages is to honor the eor set
> during the previous tcp_sendmsg(MSG_EOR) call.
>
> This patch handles the tcp_sendmsg case.  The followup patches
> will handle other skb coalescing and fragment cases.
>
> One potential use case is to use MSG_EOR with
> SOF_TIMESTAMPING_TX_ACK to get a more accurate
> TCP ack timestamping on application protocol with
> multiple outgoing response messages (e.g. HTTP2).
>
> Packetdrill script for testing:
> ~~~~~~
> +0 `sysctl -q -w net.ipv4.tcp_min_tso_segs=10`
> +0 `sysctl -q -w net.ipv4.tcp_no_metrics_save=1`
> +0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
> +0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
> +0 bind(3, ..., ...) = 0
> +0 listen(3, 1) = 0
>
> 0.100 < S 0:0(0) win 32792 <mss 1460,sackOK,nop,nop,nop,wscale 7>
> 0.100 > S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 7>
> 0.200 < . 1:1(0) ack 1 win 257
> 0.200 accept(3, ..., ...) = 4
> +0 setsockopt(4, SOL_TCP, TCP_NODELAY, [1], 4) = 0
>
> 0.200 write(4, ..., 14600) = 14600
> 0.200 sendto(4, ..., 730, MSG_EOR, ..., ...) = 730
> 0.200 sendto(4, ..., 730, MSG_EOR, ..., ...) = 730
>
> 0.200 > .  1:7301(7300) ack 1
> 0.200 > P. 7301:14601(7300) ack 1
>
> 0.300 < . 1:1(0) ack 14601 win 257
> 0.300 > P. 14601:15331(730) ack 1
> 0.300 > P. 15331:16061(730) ack 1
>
> 0.400 < . 1:1(0) ack 16061 win 257
> 0.400 close(4) = 0
> 0.400 > F. 16061:16061(0) ack 1
> 0.400 < F. 1:1(0) ack 16062 win 257
> 0.400 > . 16062:16062(0) ack 2
>
> Signed-off-by: Martin KaFai Lau <kafai@fb.com>
> Cc: Eric Dumazet <edumazet@google.com>
> Cc: Neal Cardwell <ncardwell@google.com>
> Cc: Soheil Hassas Yeganeh <soheil@google.com>
> Cc: Willem de Bruijn <willemb@google.com>
> Cc: Yuchung Cheng <ycheng@google.com>
> Suggested-by: Eric Dumazet <edumazet@google.com>

Acked-by: Soheil Hassas Yeganeh <soheil@google.com>

> ---
>  include/net/tcp.h | 8 +++++++-
>  net/ipv4/tcp.c    | 7 +++++--
>  2 files changed, 12 insertions(+), 3 deletions(-)
>
> diff --git a/include/net/tcp.h b/include/net/tcp.h
> index 7f2553d..ce08038 100644
> --- a/include/net/tcp.h
> +++ b/include/net/tcp.h
> @@ -762,7 +762,8 @@ struct tcp_skb_cb {
>
>         __u8            ip_dsfield;     /* IPv4 tos or IPv6 dsfield     */
>         __u8            txstamp_ack:1,  /* Record TX timestamp for ack? */
> -                       unused:7;
> +                       eor:1,          /* Is skb MSG_EOR marked? */
> +                       unused:6;
>         __u32           ack_seq;        /* Sequence number ACK'd        */
>         union {
>                 struct inet_skb_parm    h4;
> @@ -809,6 +810,11 @@ static inline int tcp_skb_mss(const struct sk_buff *skb)
>         return TCP_SKB_CB(skb)->tcp_gso_size;
>  }
>
> +static inline bool tcp_skb_can_collapse_to(const struct sk_buff *skb)
> +{
> +       return likely(!TCP_SKB_CB(skb)->eor);
> +}
> +
>  /* Events passed to congestion control interface */
>  enum tcp_ca_event {
>         CA_EVENT_TX_START,      /* first transmit when no packets in flight */
> diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> index 4d73858..ea5364b 100644
> --- a/net/ipv4/tcp.c
> +++ b/net/ipv4/tcp.c
> @@ -908,7 +908,8 @@ static ssize_t do_tcp_sendpages(struct sock *sk, struct page *page, int offset,
>                 int copy, i;
>                 bool can_coalesce;
>
> -               if (!tcp_send_head(sk) || (copy = size_goal - skb->len) <= 0) {
> +               if (!tcp_send_head(sk) || (copy = size_goal - skb->len) <= 0 ||
> +                   !tcp_skb_can_collapse_to(skb)) {
>  new_segment:
>                         if (!sk_stream_memory_free(sk))
>                                 goto wait_for_sndbuf;
> @@ -1156,7 +1157,7 @@ int tcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
>                         copy = max - skb->len;
>                 }
>
> -               if (copy <= 0) {
> +               if (copy <= 0 || !tcp_skb_can_collapse_to(skb)) {
>  new_segment:
>                         /* Allocate new segment. If the interface is SG,
>                          * allocate skb fitting to single page.
> @@ -1250,6 +1251,8 @@ new_segment:
>                 copied += copy;
>                 if (!msg_data_left(msg)) {
>                         tcp_tx_timestamp(sk, sockc.tsflags, skb);
> +                       if (unlikely(flags & MSG_EOR))
> +                               TCP_SKB_CB(skb)->eor = 1;
>                         goto out;
>                 }
>
> --
> 2.5.1
>
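
The effect of the eor bit on the send path can be approximated with a small user-space model. This is a sketch under assumptions, not the kernel implementation: `struct seg`, `struct sendq`, `seg_can_collapse_to()` and `queue_msg()` are hypothetical, and `size_goal` simply caps the bytes per segment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_SEGS 8

/* Hypothetical stand-in for an skb: only a length and the eor bit. */
struct seg {
	size_t len;
	bool eor;
};

struct sendq {
	struct seg segs[MAX_SEGS];
	int n;
};

/* Same idea as tcp_skb_can_collapse_to(): no appends once eor is set. */
static bool seg_can_collapse_to(const struct seg *s)
{
	return !s->eor;
}

/* Queue one message of len bytes. Returns false if the toy queue is full. */
static bool queue_msg(struct sendq *q, size_t len, size_t size_goal, bool msg_eor)
{
	assert(size_goal > 0);

	while (len > 0) {
		struct seg *tail = q->n ? &q->segs[q->n - 1] : NULL;
		size_t room, copy;

		if (!tail || tail->len >= size_goal || !seg_can_collapse_to(tail)) {
			if (q->n == MAX_SEGS)
				return false;
			tail = &q->segs[q->n++];	/* start a new segment */
			tail->len = 0;
			tail->eor = false;
		}
		room = size_goal - tail->len;
		copy = len < room ? len : room;
		tail->len += copy;
		len -= copy;
		if (len == 0 && msg_eor)
			tail->eor = true;	/* segment holding the last byte */
	}
	return true;
}
```

In this model, two 730-byte messages queued with the flag end up in two separate segments, while the same two messages queued without it share one 1460-byte segment, which matches the separate 730-byte packets expected by the packetdrill script.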

^ permalink raw reply
