Netdev List
* [PATCH 2/3] sh_eth: uninline sh_eth_soft_swap()
From: Sergei Shtylyov @ 2018-06-02 19:38 UTC (permalink / raw)
  To: netdev, David S. Miller; +Cc: linux-renesas-soc
In-Reply-To: <9027499a-0e19-7721-a17f-26e86885da3f@cogentembedded.com>

sh_eth_soft_swap() is called twice by the driver, so remove *inline* and
move the function from the header to the driver itself to let gcc decide
whether to expand it inline or not...

Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>

---
 drivers/net/ethernet/renesas/sh_eth.c |   11 +++++++++++
 drivers/net/ethernet/renesas/sh_eth.h |   12 ------------
 2 files changed, 11 insertions(+), 12 deletions(-)

Index: net-next/drivers/net/ethernet/renesas/sh_eth.c
===================================================================
--- net-next.orig/drivers/net/ethernet/renesas/sh_eth.c
+++ net-next/drivers/net/ethernet/renesas/sh_eth.c
@@ -460,6 +460,17 @@ static u32 sh_eth_tsu_read(struct sh_eth
 	return ioread32(mdp->tsu_addr + offset);
 }
 
+static void sh_eth_soft_swap(char *src, int len)
+{
+#ifdef __LITTLE_ENDIAN
+	u32 *p = (u32 *)src;
+	u32 *maxp = p + ((len + sizeof(u32) - 1) / sizeof(u32));
+
+	for (; p < maxp; p++)
+		*p = swab32(*p);
+#endif
+}
+
 static void sh_eth_select_mii(struct net_device *ndev)
 {
 	struct sh_eth_private *mdp = netdev_priv(ndev);
Index: net-next/drivers/net/ethernet/renesas/sh_eth.h
===================================================================
--- net-next.orig/drivers/net/ethernet/renesas/sh_eth.h
+++ net-next/drivers/net/ethernet/renesas/sh_eth.h
@@ -560,18 +560,6 @@ struct sh_eth_private {
 	unsigned wol_enabled:1;
 };
 
-static inline void sh_eth_soft_swap(char *src, int len)
-{
-#ifdef __LITTLE_ENDIAN
-	u32 *p = (u32 *)src;
-	u32 *maxp;
-	maxp = p + ((len + sizeof(u32) - 1) / sizeof(u32));
-
-	for (; p < maxp; p++)
-		*p = swab32(*p);
-#endif
-}
-
 static inline void *sh_eth_tsu_get_offset(struct sh_eth_private *mdp,
 					  int enum_index)
 {

^ permalink raw reply
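
For readers following the patch above, the word-rounding and swapping logic
of sh_eth_soft_swap() can be tried in userspace. This is only a sketch under
stated assumptions: swab32_sketch(), words_for_len() and soft_swap_sketch()
are illustrative names, not kernel symbols; the swap here is unconditional,
whereas the driver only does it under #ifdef __LITTLE_ENDIAN; and the buffer
is taken as uint32_t * to avoid the alignment question raised by casting a
char pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's swab32(): reverse the four bytes. */
static uint32_t swab32_sketch(uint32_t x)
{
	return (x >> 24) | ((x >> 8) & 0x0000ff00u) |
	       ((x << 8) & 0x00ff0000u) | (x << 24);
}

/* Number of 32-bit words needed to cover len bytes, rounded up. */
static size_t words_for_len(size_t len)
{
	return (len + sizeof(uint32_t) - 1) / sizeof(uint32_t);
}

/*
 * Swap every 32-bit word that covers the first len bytes of buf.
 * The caller must provide a buffer padded to a whole number of words,
 * because the last word is swapped even when len is not a multiple of 4.
 */
static void soft_swap_sketch(uint32_t *buf, size_t len)
{
	size_t i, n = words_for_len(len);

	assert(buf != NULL);
	for (i = 0; i < n; i++)
		buf[i] = swab32_sketch(buf[i]);
}
```

With a two-word buffer and len = 5, both words get swapped, since 5 bytes
round up to 2 words.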

* [PATCH 1/3] sh_eth: make sh_eth_soft_swap() work on ARM
From: Sergei Shtylyov @ 2018-06-02 19:37 UTC (permalink / raw)
  To: netdev, David S. Miller; +Cc: linux-renesas-soc
In-Reply-To: <9027499a-0e19-7721-a17f-26e86885da3f@cogentembedded.com>

Browsing through the driver disassembly, I noticed that ARM gcc generated
no code whatsoever for sh_eth_soft_swap() while building a little-endian
kernel -- apparently __LITTLE_ENDIAN__ was not being #define'd, although
it got implicitly #define'd when building with the SH gcc (I could only
find the explicit #define __LITTLE_ENDIAN that gets #include'd when building
a little-endian kernel).  Luckily, an Ether controller that only does
big-endian DMA is found on the early SH771x SoCs only, and all ARM SoCs
implement EDMR.DE and thus set 'sh_eth_cpu_data::hw_swap'. But anyway, we
need to change the #ifdef inside sh_eth_soft_swap() to something that would
work on all architectures...

Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>

---
 drivers/net/ethernet/renesas/sh_eth.h |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: net-next/drivers/net/ethernet/renesas/sh_eth.h
===================================================================
--- net-next.orig/drivers/net/ethernet/renesas/sh_eth.h
+++ net-next/drivers/net/ethernet/renesas/sh_eth.h
@@ -562,7 +562,7 @@ struct sh_eth_private {
 
 static inline void sh_eth_soft_swap(char *src, int len)
 {
-#ifdef __LITTLE_ENDIAN__
+#ifdef __LITTLE_ENDIAN
 	u32 *p = (u32 *)src;
 	u32 *maxp;
 	maxp = p + ((len + sizeof(u32) - 1) / sizeof(u32));

^ permalink raw reply
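
The failure mode described above (an #ifdef on a macro that one compiler
happens not to define, so the code silently vanishes) can be made loud
rather than silent. The following is a userspace sketch only, not what the
driver does: it assumes a GCC or Clang style compiler that predefines
__BYTE_ORDER__, and SKETCH_LITTLE_ENDIAN and runtime_is_little_endian() are
made-up names for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Refuse to build instead of silently compiling the swap code out. */
#if !defined(__BYTE_ORDER__) || !defined(__ORDER_LITTLE_ENDIAN__) || \
    !defined(__ORDER_BIG_ENDIAN__)
#error "compiler does not predefine __BYTE_ORDER__; use another test"
#endif

#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
#define SKETCH_LITTLE_ENDIAN 1
#elif __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
#define SKETCH_LITTLE_ENDIAN 0
#else
#error "unsupported byte order"
#endif

/* Run-time probe: returns 1 if the least significant byte is stored first. */
static int runtime_is_little_endian(void)
{
	uint32_t probe = 1;
	unsigned char first;

	memcpy(&first, &probe, 1);
	return first == 1;
}
```

Comparing runtime_is_little_endian() against SKETCH_LITTLE_ENDIAN in a test
is a cheap way to catch the kind of mismatch that went unnoticed here.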

* [PATCH 0/3] sh_eth: fix & clean up sh_eth_soft_swap()
From: Sergei Shtylyov @ 2018-06-02 19:32 UTC (permalink / raw)
  To: netdev, David S. Miller; +Cc: linux-renesas-soc

Hello!

Here's a set of 3 patches against DaveM's 'net-next.git' repo. The first one fixes an
old buffer endianness issue (luckily, the ARM SoCs are smart enough to not actually
care); the other two are cleanups around sh_eth_soft_swap()...

[1/3] sh_eth: make sh_eth_soft_swap() work on ARM
[2/3] sh_eth: uninline sh_eth_soft_swap()
[3/3] sh_eth: use DIV_ROUND_UP() in sh_eth_soft_swap()

MBR, Sergei

^ permalink raw reply

* Re: [PATCH bpf-next 1/2] bpf: btf: Check array t->size
From: Alexei Starovoitov @ 2018-06-02 18:31 UTC (permalink / raw)
  To: Martin KaFai Lau; +Cc: netdev, Alexei Starovoitov, Daniel Borkmann, kernel-team
In-Reply-To: <20180602160651.3960897-1-kafai@fb.com>

On Sat, Jun 02, 2018 at 09:06:50AM -0700, Martin KaFai Lau wrote:
> This patch ensures array's t->size is 0.
> 
> The array size is decided by its individual elem's size and the
> number of elements.  Hence, t->size is not used and
> it must be 0.
> 
> A test case is added to test_btf.c
> 
> Signed-off-by: Martin KaFai Lau <kafai@fb.com>

Both applied.
Please provide cover letter with short description next time
when sending series.
Thanks

^ permalink raw reply

* [PATCH net-next 2/2] mlxsw: spectrum_span: Suppress VLAN on BRIDGE_VLAN_INFO_UNTAGGED
From: Ido Schimmel @ 2018-06-02 18:09 UTC (permalink / raw)
  To: netdev; +Cc: davem, jiri, petrm, mlxsw, Ido Schimmel
In-Reply-To: <20180602180935.24544-1-idosch@mellanox.com>

From: Petr Machata <petrm@mellanox.com>

When offloading mirroring to gretap or ip6gretap netdevices, an 802.1q
bridge is one of the soft devices permissible in the underlay when
resolving the packet path. After the packet path is resolved to a
particular bridge egress device, flags on packet VLAN determine whether
the egressed packet should be tagged.

The current logic however only ever sets the VLAN tag, never suppresses
it. Thus if there's a VLAN netdevice above the bridge that determines
the packet VLAN, that VLAN is never unset, and mirroring is configured
with VLAN tagging.

Fix by setting the packet VLAN on both branches: set to zero (for unset)
when BRIDGE_VLAN_INFO_UNTAGGED, copy the resolved VLAN (e.g. from bridge
PVID) otherwise.

Fixes: 946a11e7408e ("mlxsw: spectrum_span: Allow bridge for gretap mirror")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlxsw/spectrum_span.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_span.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_span.c
index da3f7f527360..3d187d88cc7c 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_span.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_span.c
@@ -191,7 +191,9 @@ mlxsw_sp_span_entry_bridge_8021q(const struct net_device *br_dev,
 
 	if (br_vlan_get_info(edev, vid, &vinfo))
 		return NULL;
-	if (!(vinfo.flags & BRIDGE_VLAN_INFO_UNTAGGED))
+	if (vinfo.flags & BRIDGE_VLAN_INFO_UNTAGGED)
+		*p_vid = 0;
+	else
 		*p_vid = vid;
 	return edev;
 }
-- 
2.14.3

^ permalink raw reply related

* [PATCH net-next 1/2] mlxsw: spectrum_switchdev: Postpone respin on object deletion
From: Ido Schimmel @ 2018-06-02 18:09 UTC (permalink / raw)
  To: netdev; +Cc: davem, jiri, petrm, mlxsw, Ido Schimmel
In-Reply-To: <20180602180935.24544-1-idosch@mellanox.com>

From: Petr Machata <petrm@mellanox.com>

VLAN deletion notifications are emitted before the relevant change is
projected to bridge configuration. Thus, like with VLAN addition,
schedule SPAN respin for later.

Fixes: c520bc698647 ("mlxsw: Respin SPAN on switchdev events")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c
index 8a15ac49cb5a..e97652c40d13 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c
@@ -1856,7 +1856,7 @@ static int mlxsw_sp_port_obj_del(struct net_device *dev,
 		break;
 	}
 
-	mlxsw_sp_span_respin(mlxsw_sp_port->mlxsw_sp);
+	mlxsw_sp_span_respin_schedule(mlxsw_sp_port->mlxsw_sp);
 
 	return err;
 }
-- 
2.14.3

^ permalink raw reply related

* [PATCH net-next 0/2] mlxsw: Fixes in offloading of mirror-to-gretap
From: Ido Schimmel @ 2018-06-02 18:09 UTC (permalink / raw)
  To: netdev; +Cc: davem, jiri, petrm, mlxsw, Ido Schimmel

Petr says:

These two patches fix issues in offloading of mirror-to-gretap when
bridge is present in the underlay.

In patch #1, reconsideration of SPAN configuration is not done right at
the point that SWITCHDEV_OBJ_ID_PORT_VLAN deletion notification is
distributed, but is postponed, because the notifications are actually
distributed before the relevant change is implemented in the bridge.

In patch #2, a problem in configuring VLAN tagging in situations when a
VLAN device is on top of an 802.1Q bridge whose egress port is marked as
"egress untagged". In that case, mlxsw would neglect to suppress the
tagging implicitly assumed after the VLAN device was seen.

Petr Machata (2):
  mlxsw: spectrum_switchdev: Postpone respin on object deletion
  mlxsw: spectrum_span: Suppress VLAN on BRIDGE_VLAN_INFO_UNTAGGED

 drivers/net/ethernet/mellanox/mlxsw/spectrum_span.c      | 4 +++-
 drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c | 2 +-
 2 files changed, 4 insertions(+), 2 deletions(-)

-- 
2.14.3

^ permalink raw reply

* Re: ANNOUNCE: Enhanced IP v1.4
From: Willy Tarreau @ 2018-06-02 17:02 UTC (permalink / raw)
  To: Sam Patton; +Cc: netdev
In-Reply-To: <330e58f3-61d3-6abc-4f7c-1726e0ce852d@enhancedip.org>

On Sat, Jun 02, 2018 at 12:17:12PM -0400, Sam Patton wrote:
> As far as application examples, check out this simple netcat-like
> program I use for testing:
> 
> https://github.com/EnIP/enhancedip/blob/master/userspace/netcat/netcat.c
> 
> Lines 61-67 show how to connect directly via an EnIP address.  The
> netcat-like application uses
> 
> a header file called eip.h.

OK so basically we need to use a new address structure (which makes sense),
however I'm not sure I understand how the system is supposed to know it has
to use EnIP instead of IPv4 if the sin_family remains the same :

...
   memset(&serv_addr, 0, sizeof(struct sockaddr_ein));
   serv_addr.sin_family= AF_INET;
   serv_addr.sin_port = htons(portno);
   serv_addr.sin_addr1=inet_addr(argv[1]);
   serv_addr.sin_addr2=inet_addr(argv[2]);

   if (connect(sockfd,(struct sockaddr *) &serv_addr,sizeof(serv_addr)) < 0) 
	error("ERROR connecting");
...

Does it solely rely on the 3rd argument to connect() to know that it
may read sin_addr2? If so, that sounds a bit fragile to me because I
*think* (but could be wrong) that right now any size at least as large
as sockaddr_in works for IPv4 (typically I think if you pass a struct
sockaddr_storage with its size it should work). So there would be a
risk of breaking applications if this is the case.

I suspect that using a different family would make things more robust
and more friendly to applications. But that's just a first opinion,
you have worked longer than me on this.

>  You can look at it here:
> 
> https://github.com/EnIP/enhancedip/blob/master/userspace/include/eip.h

Thank you. There's a problem with your #pragma pack there: it will
propagate to every struct declared after this file is included and
will force all of them to be packed. It would be preferable
to use __attribute__((packed)) on the struct or to use #pragma pack()
safely (I've just seen it supports push/pop options).

> EnIP makes use of IPv6 AAAA records for DNS lookup.  We simply put
> 2001:0101 (which is an IPv6 experimental prefix) and
> 
> then we put the 64-bit EnIP address into the next 8 bytes of the
> address.  The remaining bytes are set to zero.

Does this require a significant amount of changes to the libc ?

> In the kernel, if you want to see how we convert the IPv6 DNS lookup
> into something connect() can manage,
> 
> check out the add_enhanced_ip() routine found here:
> 
> https://github.com/EnIP/enhancedip/blob/master/kernel/4.9.28/socket.c

Hmmm so both IPv4 and IPv6 addresses are supported and used together.
So that just makes me think that if in the end you need a little bit
of IPv6 to handle this, why not instead define a new way to map IPv6 to
Enhanced IPv4 ? I mean, you'd exclusively use AF_INET6 in applications,
within a dedicated address range, allowing you to naturally use all of
the IPv6 ecosystem in place in userland, but use IPv4 over the wire.
This way applications would possibly require even fewer modifications,
if any at all because they'd accept/connect/send/receive data for IPv6
address families that they all deal with nowadays, but this would rely
on an omni-present IPv4 infrastructure.

> The reason we had to do changes for openssh and not other applications
> (that use DNS) is openssh has a check to
> 
> see if the socket is using IP options.  If the socket does, sshd drops
> the connection.  I had to work around that to get openssh working

OK. They possibly do this to avoid source-routing.

> with EnIP.  The result: if you want to connect the netcat-like program
> with IP addresses you'll end up doing something like the
> 
> example above.  If you're using DNS (getaddrinfo) to connect(), it
> should just work (except for sshd as noted).
> 
> Here's the draft experimental RFC:
> https://tools.ietf.org/html/draft-chimiak-enhanced-ipv4-03

Yep I've found it on the site.

> I'll also note that I am doing this code part time as a hobby for a long
> time so I appreciate your help and support.  It would be really
> 
> great if the kernel community decided to pick this up, but if it's not a
> reality please let me know soonest so I can move on to a
> 
> different hobby.  :)

I *personally* think there is some value in what you're experimenting
with, and doing it as a hobby can leave you with the time needed to
collect pros and cons from various people. I'm just thinking that in
order to get some adoption, you need to be extremely careful not to
break any of the IPv4 applications developed in the last 38 years,
and by this, AF_INET+sizeof() scares me a little bit.

Regards,
Willy

^ permalink raw reply
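
The two safer packing options mentioned in the message above can be sketched
as follows. This is not the contents of eip.h; the struct names and fields
are invented for illustration, and the attribute form assumes a GCC or Clang
compatible compiler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Option 1: limit the packing to one region with push/pop. */
#pragma pack(push, 1)
struct packed_by_pragma {
	uint8_t  tag;
	uint32_t value;
};
#pragma pack(pop)

/* Option 2: pack a single struct with the GCC/Clang attribute. */
struct packed_by_attribute {
	uint8_t  tag;
	uint32_t value;
} __attribute__((packed));

/* Declared after the pop, so it keeps its natural alignment and padding. */
struct declared_later {
	uint8_t  tag;
	uint32_t value;
};

/* Both packed variants have no padding between tag and value. */
static_assert(sizeof(struct packed_by_pragma) == 5, "pragma packing failed");
static_assert(sizeof(struct packed_by_attribute) == 5, "attribute packing failed");
```

On common ABIs struct declared_later is larger than 5 bytes because padding
is inserted before value; with a bare "#pragma pack(1)" and no pop it would
have been packed as well, which is the propagation problem pointed out above.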

* Re: [PATCH 4/4] cpsw: add switchdev support
From: Ilias Apalodimas @ 2018-06-02 16:52 UTC (permalink / raw)
  To: Florian Fainelli
  Cc: Andrew Lunn, netdev, grygorii.strashko, ivan.khoronzhuk, nsekhar,
	jiri, ivecera, francois.ozog, yogeshs, spatton
In-Reply-To: <A6381ECC-4C7C-4CCB-88B0-9FC77FE18F66@gmail.com>

On Sat, Jun 02, 2018 at 09:10:08AM -0700, Florian Fainelli wrote:
> On June 2, 2018 3:34:32 AM MST, Ilias Apalodimas <ilias.apalodimas@linaro.org> wrote:
> >Hi Florian, 
> >
> >Thanks for taking time to look into this
> >
> >On Fri, Jun 01, 2018 at 02:48:48PM -0700, Florian Fainelli wrote:
> >> 
> >> 
> >> On 05/24/2018 09:56 PM, Ilias Apalodimas wrote:
> >> > On Thu, May 24, 2018 at 06:39:04PM +0200, Andrew Lunn wrote:
> >> >> On Thu, May 24, 2018 at 04:32:34PM +0300, Ilias Apalodimas wrote:
> >> >>> On Thu, May 24, 2018 at 03:12:29PM +0200, Andrew Lunn wrote:
> >> >>>> Device tree is supposed to describe the hardware. Using that
> >hardware
> >> >>>> in different ways is not something you should describe in DT.
> >> >>>>
> >> >>> The new switchdev mode is applied with a .config option in the
> >kernel. What you
> >> >>> see is pre-existing code, so i am not sure if i should change it
> >in this
> >> >>> patchset.
> >> >>
> >> >> If you break the code up into a library and two drivers, it
> >becomes a
> >> >> moot point.
> >> > Agree
> >> > 
> >> >>
> >> >> But what i don't like here is that the device tree says to do dual
> >> >> mac. But you ignore that and do something else. I would prefer that
> >if
> >> >> DT says dual mac, and switchdev is compiled in, the probe fails
> >with
> >> >> EINVAL. Rather than ignore something, make it clear it is invalid.
> >> > The switch has 3 modes of operation as is.
> >> > 1. switch mode, to enable that you don't need to add anything on
> >> > the DTS and linux registers a single netdev interface.
> >> > 2. dual mac mode, this is when you need to add dual_emac; on the
> >DTS.
> >> > 3. switchdev mode which is controlled by a .config option, since as
> >you 
> >> > pointed out DTS was not made for controlling config options. 
> >> > 
> >> > I agree that this is far from beautiful. If the driver remains as
> >in though,
> >> > i'd prefer either keeping what's there or making "switchdev" a DTS
> >option, 
> >> > following the pre-existing erroneous usage rather than making the
> >device 
> >> > unusable.  If we end up returning some error and refuse to
> >initialize, users 
> >> > that remote upgrade their equipment, without taking a good look at
> >changelog,
> >> > will lose access to their devices with no means of remotely fixing
> >that.
> >> 
> >> It seems to me that the mistake here is seeing multiple modes of
> >> operations for the cpsw. There are not actually many, there is one
> >> usage, and then there is what you can and cannot offload. 
> >CPSW has in fact 2 modes of operation, different FIFO usage/lookup
> >entry(it's
> >called ALE in the current driver) by-pass(which is used in dual emac
> >for 
> >example) and other features. Again Grygorii is better suited to answer
> >the 
> >exact differences.
> >> The basic
> >> premise with switchdev and DSA (which uses switchdev) is that each
> >> user-facing port of your switch needs to work as if it were a normal
> >> Ethernet NIC, that is what you call dual-MAC I believe. Then, when
> >you
> >> create a bridge and you enslave those ports into the bridge, you need
> >to
> >> have forwarding done in hardware between these two ports when the
> >> SMAC/DMAC are not for the host/CPU/management interface and you must
> >> simultaneously still have the host have the ability to send/receive
> >> traffic through the bridge device.
> >Yes dual emac does that. But dual emac configures the port facing VLAN
> >to the
> >CPU port as well. So dual emac splits and uses 2 interfaces. VLAN 1 is
> >configured on port1 + CPU port and VLAN 2 is configured on port 2 + CPU
> >port
> >That's exactly what the current RFC does as well, with the addition of
> >registering a sw0p0 (i'll explain why later on this mail)
> >A little more detail on the issue we are having. On my description 
> >sw0p0 -> CPU port, sw0p1 -> port 1 sw0p2 -> port 2. sw0p1/sw0p2 are the
> >ports
> >that have PHYs attached. 
> >
> >When we start in the new switchdev mode all interfaces are added to
> >VLAN 0
> >so CPU port + port1 + port2 are all in the same VLAN group. In that
> >case sw0p1
> >and sw0p2 are working as you describe. So those 2 interfaces can
> >send/receive
> >traffic normally which matches the switchdev case.
> >
> >When we add them on a bridge(let's say br0), VLAN1(or any default
> >bridge VLAN)
> >is now configured on sw0p1 and sw0p2 but *not* on the CPU port. 
> >From this point on the whole functionality just collapses. The switch
> >will 
> >work and offload traffic between sw0p1/sw0p2 but the bridge won't be
> >able to 
> >get an ip address (since VLAN1 is not a member of the CPU port and the
> >packet 
> >gets dropped). 
> >IGMPv2/V3 messages will never reach the br_multicast.c code to trigger 
> >switchdev and configure the MDBs on the ports.  i am pretty sure there
> >are other
> >fail scenarios which i haven't discovered yet, but those 2 are the
> >most 
> >basic ones.  If we add VLAN1 on the CPU port, everything works as
> >intended of 
> >course.
> >
> >That's the reason we registered sw0p0 as the CPU port. It can't do any
> >"real"
> >traffic, but you can configure the CPU port independently and not be
> >forced to
> >do an OR on every VLAN add/delete grouping the CPU port with your port
> >command.
> >The TL;DR version of this is that the switch is working exactly as
> >switchdev is
> >expecting offloading traffic to the hardware when possible as long as
> >the CPU
> >port is member of the proper VLANs
> >
> >Petr's patch solves this for us
> >(9c86ce2c1ae337fc10568a12aea812ed03de8319).
> >We can now do "bridge vlan add dev br0 vid 100 pvid untagged self" and
> >decide
> >when to add the CPU port or not. 
> >
> >There are still a couple of cases that are not covered though, if we
> >don't 
> >register the CPU port. We can't decide when to forward multicast
> >traffic on the CPU port if a join hasn't been sent from br0.
> >So let's say you got 2 hosts doing multicast and for whatever reason
> >the host
> >wants to see that traffic. 
> >With the CPU port present you can do a 
> >"bridge mdb add dev br0 port sw0p0 grp 239.1.1.1 permanent" which will
> >offload
> >the traffic to the CPU port and thus the host. If this goes away we are
> >forced
> >to send a join.
> 
> Thanks for the detailed explanation. Somehow I was under the impression that cpsw had the ability, through specific DMA descriptor bits to direct traffic towards one external port or another and conversely, have that information from the HW when receiving packets. 
That's one mode of operation when by-passing the ALE if my understanding of
the hardware is correct. You can choose not to do that. I am still learning all
the details of the hardware myself. On Rx though you still need the CPU to
participate to receive the packet (and yes, the descriptor indicates the port).

> What you describe is exactly the same problem we have in DSA when the switch advertises DSA_TAG_PROTO_NONE where only VLAN tags could help differentiate traffic from external ports. At some point there was a discussion of making DSA_TAG_PROTO_NONE automatically create one VLAN per port but this is a good source for other problems...
I am pretty sure i am not the first one to encounter this kind of problem
> 
> Looking forward to your follow-up patch series!
Will do!
> 
> -- 
> Florian

Thanks
Ilias

^ permalink raw reply
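
The VLAN membership problem described in the thread above (the bridge's
default VLAN is set on sw0p1 and sw0p2 but not on the CPU port, so traffic
for the host is dropped) can be pictured with a toy bitmask model. This is
purely illustrative and assumes nothing about the real ALE or the cpsw
driver: vlan_add(), reaches_port() and the port numbering are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Port numbering follows the naming in the thread: sw0p0 is the CPU port. */
enum { PORT_CPU = 0, PORT_1 = 1, PORT_2 = 2 };

#define MAX_VID 4096

/* One membership bitmask per VLAN ID: bit n set means port n is a member. */
static uint8_t vlan_members[MAX_VID];

static void vlan_add(uint16_t vid, uint8_t port_mask)
{
	assert(vid < MAX_VID);
	vlan_members[vid] |= port_mask;
}

/* A frame tagged with vid is only delivered to ports that are members. */
static int reaches_port(uint16_t vid, int port)
{
	assert(vid < MAX_VID);
	return (vlan_members[vid] >> port) & 1;
}
```

Adding VLAN 1 to ports 1 and 2 only lets the switch forward between them,
but reaches_port(1, PORT_CPU) stays 0 until the CPU port is added too, which
is what the "bridge vlan add dev br0 ... self" command mentioned in the
thread is used for.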

* Re: [PATCH net] packet: fix reserve calculation
From: Lennert Buytenhek @ 2018-06-02 16:35 UTC (permalink / raw)
  To: Willem de Bruijn; +Cc: netdev, davem, Willem de Bruijn
In-Reply-To: <20180524221030.158150-1-willemdebruijn.kernel@gmail.com>

On Thu, May 24, 2018 at 06:10:30PM -0400, Willem de Bruijn wrote:

> From: Willem de Bruijn <willemb@google.com>
> 
> Commit b84bbaf7a6c8 ("packet: in packet_snd start writing at link
> layer allocation") ensures that packet_snd always starts writing
> the link layer header in reserved headroom allocated for this
> purpose.
> 
> This is needed because packets may be shorter than hard_header_len,
> in which case the space up to hard_header_len may be zeroed. But
> that necessary padding is not accounted for in skb->len.
> 
> The fix, however, is buggy. It calls skb_push, which grows skb->len
> when moving skb->data back. But in this case packet length should not
> change.
> 
> Instead, call skb_reserve, which moves both skb->data and skb->tail
> back, without changing length.
> 
> Fixes: b84bbaf7a6c8 ("packet: in packet_snd start writing at link layer allocation")
> Reported-by: Tariq Toukan <tariqt@mellanox.com>
> Signed-off-by: Willem de Bruijn <willemb@google.com>
> Acked-by: Soheil Hassas Yeganeh <soheil@google.com>

After upgrading my router from 4.16.11 to 4.16.12, it is failing to
obtain a DHCP lease from my ISP, as it started sending out DHCP queries
with 14 bytes of junk at the end (which is presumably causing RX csum
failures on the DHCP server end):

 13:08:39.292667 IP (tos 0x10, ttl 128, id 0, offset 0, flags [none], proto UDP (17), length 328)
     0.0.0.0.68 > 255.255.255.255.67: [udp sum ok] BOOTP/DHCP, Request from xx:xx:xx:xx:xx:xx, length 300, xid 0xxxxxxxxx, Flags [none] (0x0000)
[...]
-        0x0150:  0000 0000 0000
+        0x0150:  0000 0000 0000 e802 0000 0000 0000 e802
+        0x0160:  0000 0000

This seems to be caused by (the -stable backport of) b84bbaf7a6c8
("packet: in packet_snd start writing at link layer allocation") and
appears to have been fixed by this patch, as applying this patch to
4.16.12 makes DHCP work for me again.

Tested-by: Lennert Buytenhek <buytenh@wantstofly.org>



> ---
>  net/packet/af_packet.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
> index e9422fe45179..acb7b86574cd 100644
> --- a/net/packet/af_packet.c
> +++ b/net/packet/af_packet.c
> @@ -2911,7 +2911,7 @@ static int packet_snd(struct socket *sock, struct msghdr *msg, size_t len)
>  		if (unlikely(offset < 0))
>  			goto out_free;
>  	} else if (reserve) {
> -		skb_push(skb, reserve);
> +		skb_reserve(skb, -reserve);
>  	}
>  
>  	/* Returns -EFAULT on error */
> -- 
> 2.17.0.921.gf22659ad46-goog

^ permalink raw reply
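
Why skb_push() added 14 bytes of junk while skb_reserve() with a negative
argument does not can be seen in a toy model of the three fields involved.
This is a sketch, not the kernel's struct sk_buff: struct toy_skb and its
helpers are made up, offsets stand in for pointers, and the numbers are
chosen to match the 342-byte frame in the tcpdump output above.

```c
#include <assert.h>
#include <stddef.h>

struct toy_skb {
	int data; /* offset of the first byte to transmit */
	int tail; /* offset one past the last byte to transmit */
	int len;  /* number of bytes the stack believes it has to send */
};

/* Like skb_push(): expose n more bytes at the front, so len grows. */
static void toy_push(struct toy_skb *skb, int n)
{
	assert(skb != NULL && n >= 0 && skb->data >= n);
	skb->data -= n;
	skb->len += n;
}

/* Like skb_reserve(): shift data and tail together, len is untouched. */
static void toy_reserve(struct toy_skb *skb, int n)
{
	assert(skb != NULL && skb->data + n >= 0);
	skb->data += n;
	skb->tail += n;
}
```

Starting from data = 14, tail = 356, len = 342, toy_push(skb, 14) leaves len
at 356 (14 bytes too many), while toy_reserve(skb, -14) moves both offsets
back and keeps len at 342.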

* Re: ANNOUNCE: Enhanced IP v1.4
From: Sam Patton @ 2018-06-02 16:17 UTC (permalink / raw)
  To: Willy Tarreau; +Cc: netdev
In-Reply-To: <20180602055717.GB17899@1wt.eu>

Hello Willy, netdev,

Thank you for your reply and advice.  I couldn't agree more with you
about containers and the exciting prospects there, as well as the ADSL
scenario you mention.

As far as application examples, check out this simple netcat-like
program I use for testing:

https://github.com/EnIP/enhancedip/blob/master/userspace/netcat/netcat.c

Lines 61-67 show how to connect directly via an EnIP address.  The
netcat-like application uses a header file called eip.h.  You can look
at it here:

https://github.com/EnIP/enhancedip/blob/master/userspace/include/eip.h

EnIP makes use of IPv6 AAAA records for DNS lookup.  We simply put
2001:0101 (which is an IPv6 experimental prefix) and then we put the
64-bit EnIP address into the next 8 bytes of the address.  The
remaining bytes are set to zero.

In the kernel, if you want to see how we convert the IPv6 DNS lookup
into something connect() can manage, check out the add_enhanced_ip()
routine found here:

https://github.com/EnIP/enhancedip/blob/master/kernel/4.9.28/socket.c

The reason we had to do changes for openssh and not other applications
(that use DNS) is openssh has a check to see if the socket is using IP
options.  If the socket does, sshd drops the connection.  I had to work
around that to get openssh working with EnIP.  The result: if you want
to connect the netcat-like program with IP addresses you'll end up
doing something like the example above.  If you're using DNS
(getaddrinfo) to connect(), it should just work (except for sshd as
noted).

Here's the draft experimental RFC:
https://tools.ietf.org/html/draft-chimiak-enhanced-ipv4-03

I'll also note that I am doing this code part time as a hobby for a long
time so I appreciate your help and support.  It would be really great
if the kernel community decided to pick this up, but if it's not a
reality please let me know soonest so I can move on to a different
hobby.  :)

Thank you.

Sam Patton

On 6/2/18 1:57 AM, Willy Tarreau wrote:
> Hello Sam,
>
> On Fri, Jun 01, 2018 at 09:48:28PM -0400, Sam Patton wrote:
>> Hello!
>>
>> If you do not know what Enhanced IP is, read this post on netdev first:
>>
>> https://www.spinics.net/lists/netdev/msg327242.html
>>
>>
>> The Enhanced IP project presents:
>>
>>              Enhanced IP v1.4
>>
>> The Enhanced IP (EnIP) code has been updated.  It now builds with OpenWRT barrier breaker (for 148 different devices). We've been testing with the Western Digital N600 and N750 wireless home routers.
> (...) First note, please think about breaking your lines if you want your
> mails to be read by the widest audience, as for some of us here, reading
> lines wider than a terminal is really annoying, and often not considered
> worth spending time on them considering there are so many easier ones
> left to read.
>
>> Interested in seeing Enhanced IP in the Linux kernel, read on.  Not
>> interested in seeing Enhanced IP in the Linux kernel read on.
> (...)
>
> So I personally find the concept quite interesting. It reminds me of the
> previous IPv5/IPv7/IPv8 initiatives, which in my opinion were a bit hopeless.
> Here the fact that you decide to consider the IPv4 address as a network opens
> new perspectives. For containerized environments it could be considered that
> each server, with one IPv4, can host 2^32 guests and that NAT is not needed
> anymore for example. It could also open the possibility that enthusiasts
> can more easily host some services at home behind their ADSL line without
> having to run on strange ports.
>
> However I think your approach is not the most efficient to encourage adoption.
> It's important to understand that there will be little incentive for people
> to patch their kernels to run some code if they don't have the applications
> on top of it. The kernel is not the end goal for most users, the kernel is
> just the lower layer needed to run applications on top. I looked at your site
> and the github repo, and all I could find was a pre-patched openssh, no simple
> explanation of what to change in an application.
>
> What you need to do first is to *explain* how to modify userland applications
> to support En-IP, provide an echo server and show the parts which have to be
> changed. Write a simple client and do the same. Provide your changes to
> existing programs as patches, not as pre-patched code. This way anyone can
> use your patches on top of other versions, and can use these patches to
> understand what has to be modified in their applications.
>
> Once applications are easy to patch, the incentive to install patched kernels
> everywhere will be higher. For many enthusiasts, knowing that they only have
> to modify the ADSL router to automatically make their internal IoT stuff
> accessible from outside indeed becomes appealing.
>
> Then you'll need to provide patches for well known applications like curl,
> wget, DNS servers (bind...), then browsers.
>
> In my case I could be interested in adding support for En-ip into haproxy,
> and only once I don't see any showstopper in doing this, I'd be willing to
> patch my kernel to support it.
>
> Last advice, provide links to your drafts in future e-mails, they are not
> easy to find on your site, we have to navigate through various pages to
> finally find them.
>
> Regards,
> Willy

^ permalink raw reply
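
The AAAA layout described in the message above (prefix 2001:0101, then the
64-bit EnIP address in the next 8 bytes, then zeros) can be sketched as a
small helper. This is an assumption-laden illustration, not code from the
EnIP repository: enip_to_aaaa_sketch() is a made-up name, and treating the
64-bit address as two IPv4 addresses in network byte order, with addr1
first, is a guess based on the sin_addr1/sin_addr2 fields quoted elsewhere
in this thread.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Build the 16 bytes of an IPv6 address carrying an EnIP address:
 *   bytes 0..3   prefix 2001:0101
 *   bytes 4..7   first IPv4 address  (assumed order)
 *   bytes 8..11  second IPv4 address (assumed order)
 *   bytes 12..15 zero
 */
static void enip_to_aaaa_sketch(const uint8_t addr1[4], const uint8_t addr2[4],
				uint8_t out[16])
{
	static const uint8_t prefix[4] = { 0x20, 0x01, 0x01, 0x01 };

	assert(addr1 != NULL && addr2 != NULL && out != NULL);
	memset(out, 0, 16);
	memcpy(out, prefix, sizeof(prefix));
	memcpy(out + 4, addr1, 4);
	memcpy(out + 8, addr2, 4);
}
```

For the documentation address 192.0.2.1 paired with 10.0.0.5, this yields
the textual form 2001:101:c000:201:a00:5::, under the ordering assumed above.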

* Re: [PATCH 4/4] cpsw: add switchdev support
From: Florian Fainelli @ 2018-06-02 16:10 UTC (permalink / raw)
  To: Ilias Apalodimas
  Cc: Andrew Lunn, netdev, grygorii.strashko, ivan.khoronzhuk, nsekhar,
	jiri, ivecera, francois.ozog, yogeshs, spatton
In-Reply-To: <20180602103432.GA948@apalos>

On June 2, 2018 3:34:32 AM MST, Ilias Apalodimas <ilias.apalodimas@linaro.org> wrote:
>Hi Florian, 
>
>Thanks for taking time to look into this
>
>On Fri, Jun 01, 2018 at 02:48:48PM -0700, Florian Fainelli wrote:
>> 
>> 
>> On 05/24/2018 09:56 PM, Ilias Apalodimas wrote:
>> > On Thu, May 24, 2018 at 06:39:04PM +0200, Andrew Lunn wrote:
>> >> On Thu, May 24, 2018 at 04:32:34PM +0300, Ilias Apalodimas wrote:
>> >>> On Thu, May 24, 2018 at 03:12:29PM +0200, Andrew Lunn wrote:
>> >>>> Device tree is supposed to describe the hardware. Using that hardware
>> >>>> in different ways is not something you should describe in DT.
>> >>>>
>> >>> The new switchdev mode is applied with a .config option in the kernel. What you
>> >>> see is pre-existing code, so i am not sure if i should change it in this
>> >>> patchset.
>> >>
>> >> If you break the code up into a library and two drivers, it becomes a
>> >> moot point.
>> > Agree
>> > 
>> >>
>> >> But what i don't like here is that the device tree says to do dual
>> >> mac. But you ignore that and do something else. I would prefer that if
>> >> DT says dual mac, and switchdev is compiled in, the probe fails with
>> >> EINVAL. Rather than ignore something, make it clear it is invalid.
>> > The switch has 3 modes of operation as is.
>> > 1. switch mode, to enable that you don't need to add anything on
>> > the DTS and linux registers a single netdev interface.
>> > 2. dual mac mode, this is when you need to add dual_emac; on the DTS.
>> > 3. switchdev mode which is controlled by a .config option, since as you
>> > pointed out DTS was not made for controlling config options.
>> > 
>> > I agree that this is far from beautiful. If the driver remains as is though,
>> > i'd prefer either keeping what's there or making "switchdev" a DTS option,
>> > following the pre-existing erroneous usage rather than making the device
>> > unusable.  If we end up returning some error and refuse to initialize, users
>> > that remote upgrade their equipment, without taking a good look at changelog,
>> > will lose access to their devices with no means of remotely fixing that.
>> 
>> It seems to me that the mistake here is seeing multiple modes of
>> operations for the cpsw. There are not actually many, there is one
>> usage, and then there is what you can and cannot offload. 
>CPSW has in fact 2 modes of operation, different FIFO usage/lookup entry(it's
>called ALE in the current driver) by-pass(which is used in dual emac for
>example) and other features. Again Grygorii is better suited to answer the
>exact differences.
>> The basic
>> premise with switchdev and DSA (which uses switchdev) is that each
>> user-facing port of your switch needs to work as if it were a normal
>> Ethernet NIC, that is what you call dual-MAC I believe. Then, when you
>> create a bridge and you enslave those ports into the bridge, you need to
>> have forwarding done in hardware between these two ports when the
>> SMAC/DMAC are not for the host/CPU/management interface and you must
>> simultaneously still have the host have the ability to send/receive
>> traffic through the bridge device.
>Yes dual emac does that. But dual emac configures the port facing VLAN to the
>CPU port as well. So dual emac splits and uses 2 interfaces. VLAN 1 is
>configured on port1 + CPU port and VLAN 2 is configured on port 2 + CPU port.
>That's exactly what the current RFC does as well, with the addition of
>registering a sw0p0 (i'll explain why later on this mail)
>A little more detail on the issue we are having. On my description
>sw0p0 -> CPU port, sw0p1 -> port 1 sw0p2 -> port 2. sw0p1/sw0p2 are the ports
>that have PHYs attached.
>
>When we start in the new switchdev mode all interfaces are added to VLAN 0
>so CPU port + port1 + port2 are all in the same VLAN group. In that case sw0p1
>and sw0p2 are working as you describe. So those 2 interfaces can send/receive
>traffic normally which matches the switchdev case.
>
>When we add them on a bridge(let's say br0), VLAN1(or any default bridge VLAN)
>is now configured on sw0p1 and sw0p2 but *not* on the CPU port.
>From this point on the whole functionality just collapses. The switch will
>work and offload traffic between sw0p1/sw0p2 but the bridge won't be able to
>get an ip address (since VLAN1 is not a member of the CPU port and the packet
>gets dropped).
>IGMPv2/V3 messages will never reach the br_multicast.c code to trigger
>switchdev and configure the MDBs on the ports.  i am pretty sure there are other
>fail scenarios which i haven't discovered yet, but those 2 are the most
>basic ones.  If we add VLAN1 on the CPU port, everything works as intended of
>course.
>
>That's the reason we registered sw0p0 as the CPU port. It can't do any "real"
>traffic, but you can configure the CPU port independently and not be forced to
>do an OR on every VLAN add/delete grouping the CPU port with your port command.
>The TL;DR version of this is that the switch is working exactly as switchdev is
>expecting, offloading traffic to the hardware when possible, as long as the CPU
>port is a member of the proper VLANs.
>
>Petr's patch solves this for us (9c86ce2c1ae337fc10568a12aea812ed03de8319).
>We can now do "bridge vlan add dev br0 vid 100 pvid untagged self" and decide
>when to add the CPU port or not.
>
>There are still a couple of cases that are not covered though, if we don't
>register the CPU port. We can't decide when to forward multicast
>traffic on the CPU port if a join hasn't been sent from br0.
>So let's say you got 2 hosts doing multicast and for whatever reason the host
>wants to see that traffic.
>With the CPU port present you can do a
>"bridge mdb add dev br0 port sw0p0 grp 239.1.1.1 permanent" which will offload
>the traffic to the CPU port and thus the host. If this goes away we are forced
>to send a join.

Thanks for the detailed explanation. Somehow I was under the impression
that cpsw had the ability, through specific DMA descriptor bits, to direct
traffic towards one external port or another and, conversely, to have that
information from the HW when receiving packets. What you describe is
exactly the same problem we have in DSA when the switch advertises
DSA_TAG_PROTO_NONE, where only VLAN tags could help differentiate traffic
from external ports. At some point there was a discussion of making
DSA_TAG_PROTO_NONE automatically create one VLAN per port, but this is a
good source for other problems...

Looking forward to your follow-up patch series!

-- 
Florian

^ permalink raw reply

* [PATCH bpf-next 2/2] bpf: btf: Ensure t->type == 0 for BTF_KIND_FWD
From: Martin KaFai Lau @ 2018-06-02 16:06 UTC (permalink / raw)
  To: netdev; +Cc: Alexei Starovoitov, Daniel Borkmann, kernel-team
In-Reply-To: <20180602160651.3960897-1-kafai@fb.com>

The t->type in BTF_KIND_FWD is not used.  It must be 0.
This patch ensures that and also adds a test case in test_btf.c

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
---
 kernel/bpf/btf.c                       | 21 ++++++++++++++++++++-
 tools/testing/selftests/bpf/test_btf.c | 22 ++++++++++++++++++++++
 2 files changed, 42 insertions(+), 1 deletion(-)

diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index 84ad532f2854..8653ab004c73 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -1286,8 +1286,27 @@ static struct btf_kind_operations ptr_ops = {
 	.seq_show = btf_ptr_seq_show,
 };
 
+static s32 btf_fwd_check_meta(struct btf_verifier_env *env,
+			      const struct btf_type *t,
+			      u32 meta_left)
+{
+	if (btf_type_vlen(t)) {
+		btf_verifier_log_type(env, t, "vlen != 0");
+		return -EINVAL;
+	}
+
+	if (t->type) {
+		btf_verifier_log_type(env, t, "type != 0");
+		return -EINVAL;
+	}
+
+	btf_verifier_log_type(env, t, NULL);
+
+	return 0;
+}
+
 static struct btf_kind_operations fwd_ops = {
-	.check_meta = btf_ref_type_check_meta,
+	.check_meta = btf_fwd_check_meta,
 	.resolve = btf_df_resolve,
 	.check_member = btf_df_check_member,
 	.log_details = btf_ref_type_log,
diff --git a/tools/testing/selftests/bpf/test_btf.c b/tools/testing/selftests/bpf/test_btf.c
index fd8246e84149..3619f3023088 100644
--- a/tools/testing/selftests/bpf/test_btf.c
+++ b/tools/testing/selftests/bpf/test_btf.c
@@ -1242,6 +1242,28 @@ static struct btf_raw_test raw_tests[] = {
 	.err_str = "Invalid btf_info",
 },
 
+{
+	.descr = "fwd test. t->type != 0",
+	.raw_types = {
+		/* int */				/* [1] */
+		BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 32, 4),
+		/* fwd type */				/* [2] */
+		BTF_TYPE_ENC(0, BTF_INFO_ENC(BTF_KIND_FWD, 0, 0), 1),
+		BTF_END_RAW,
+	},
+	.str_sec = "",
+	.str_sec_size = sizeof(""),
+	.map_type = BPF_MAP_TYPE_ARRAY,
+	.map_name = "fwd_test_map",
+	.key_size = sizeof(int),
+	.value_size = sizeof(int),
+	.key_type_id = 1,
+	.value_type_id = 1,
+	.max_entries = 4,
+	.btf_load_err = true,
+	.err_str = "type != 0",
+},
+
 }; /* struct btf_raw_test raw_tests[] */
 
 static const char *get_next_str(const char *start, const char *end)
-- 
2.9.5

^ permalink raw reply related
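
The check added by the patch above can be tried outside the kernel with a
small userspace sketch. Everything below is an assumption made for
illustration: the struct is a simplified stand-in for the kernel's
struct btf_type, the sketch_* names are hypothetical, and the bit layout
(vlen in the low 16 bits of info, kind starting at bit 24, FWD kind value 7)
is written from memory of the uapi header rather than copied from it.

```c
#include <errno.h>
#include <stdint.h>

/* Simplified, hypothetical stand-in for the kernel's struct btf_type. */
struct sketch_btf_type {
	uint32_t name_off;
	uint32_t info;	/* assumed layout: bits 0-15 vlen, bits 24+ kind */
	uint32_t type;	/* shares storage with "size" in the real struct */
};

#define SKETCH_KIND_FWD 7	/* assumed value of BTF_KIND_FWD */

/* Build an info word from a kind and a vlen (illustrative helper). */
static uint32_t sketch_info_enc(uint32_t kind, uint32_t vlen)
{
	return (kind << 24) | (vlen & 0xffff);
}

static uint32_t sketch_vlen(const struct sketch_btf_type *t)
{
	return t->info & 0xffff;
}

/*
 * Mirror of the two rejections in the patch: a forward declaration
 * carries no members (vlen == 0) and references no type (type == 0).
 * Returns 0 when both hold, -EINVAL otherwise.
 */
static int sketch_fwd_check_meta(const struct sketch_btf_type *t)
{
	if (sketch_vlen(t))
		return -EINVAL;

	if (t->type)
		return -EINVAL;

	return 0;
}
```

With this sketch, a FWD record whose type field is 1 (as in the new test
case) is rejected, while one with type 0 and vlen 0 is accepted.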

* [PATCH bpf-next 1/2] bpf: btf: Check array t->size
From: Martin KaFai Lau @ 2018-06-02 16:06 UTC (permalink / raw)
  To: netdev; +Cc: Alexei Starovoitov, Daniel Borkmann, kernel-team

This patch ensures array's t->size is 0.

The array size is decided by its individual elem's size and the
number of elements.  Hence, t->size is not used and
it must be 0.

A test case is added to test_btf.c

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
---
 kernel/bpf/btf.c                       |  5 +++++
 tools/testing/selftests/bpf/test_btf.c | 23 +++++++++++++++++++++++
 2 files changed, 28 insertions(+)

diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index 3d20aa1f4b54..84ad532f2854 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -1342,6 +1342,11 @@ static s32 btf_array_check_meta(struct btf_verifier_env *env,
 		return -EINVAL;
 	}
 
+	if (t->size) {
+		btf_verifier_log_type(env, t, "size != 0");
+		return -EINVAL;
+	}
+
 	/* Array elem type and index type cannot be in type void,
 	 * so !array->type and !array->index_type are not allowed.
 	 */
diff --git a/tools/testing/selftests/bpf/test_btf.c b/tools/testing/selftests/bpf/test_btf.c
index 35064df688c1..fd8246e84149 100644
--- a/tools/testing/selftests/bpf/test_btf.c
+++ b/tools/testing/selftests/bpf/test_btf.c
@@ -1179,6 +1179,29 @@ static struct btf_raw_test raw_tests[] = {
 },
 
 {
+	.descr = "array test. t->size != 0",
+	.raw_types = {
+		/* int */				/* [1] */
+		BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 32, 4),
+		/* int[16] */				/* [2] */
+		BTF_TYPE_ENC(0, BTF_INFO_ENC(BTF_KIND_ARRAY, 0, 0), 1),
+		BTF_ARRAY_ENC(1, 1, 16),
+		BTF_END_RAW,
+	},
+	.str_sec = "",
+	.str_sec_size = sizeof(""),
+	.map_type = BPF_MAP_TYPE_ARRAY,
+	.map_name = "array_test_map",
+	.key_size = sizeof(int),
+	.value_size = sizeof(int),
+	.key_type_id = 1,
+	.value_type_id = 1,
+	.max_entries = 4,
+	.btf_load_err = true,
+	.err_str = "size != 0",
+},
+
+{
 	.descr = "int test. invalid int_data",
 	.raw_types = {
 		BTF_TYPE_ENC(0, BTF_INFO_ENC(BTF_KIND_INT, 0, 0), 4),
-- 
2.9.5

^ permalink raw reply related
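
The reasoning in the commit message above (an array's size follows from its
element size and element count, so a stored t->size carries no information
and must be 0) can be illustrated with a short userspace sketch. The struct
and the sketch_* function names are hypothetical and are not the kernel's
own definitions.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical description of an array type: element size and count. */
struct sketch_array {
	uint32_t elem_size;	/* size of one element in bytes */
	uint32_t nelems;	/* number of elements */
};

/* The array's byte size is derived, never stored. */
static uint64_t sketch_array_bytes(const struct sketch_array *a)
{
	return (uint64_t)a->elem_size * a->nelems;
}

/*
 * Mirror of the added check: a non-zero stored size on an array record
 * is rejected because the field is unused for arrays.
 */
static int sketch_array_check_size(uint32_t stored_size)
{
	return stored_size ? -EINVAL : 0;
}
```

For the int[16] used in the test case, a 4-byte element gives 64 bytes,
independent of whatever the record's size field contains.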

* Re: [PATCH 10/18] rhashtable: remove rhashtable_walk_peek()
From: Herbert Xu @ 2018-06-02 15:48 UTC (permalink / raw)
  To: NeilBrown; +Cc: Thomas Graf, netdev, linux-kernel, Tom Herbert
In-Reply-To: <152782824964.30340.6329146982899668633.stgit@noble>

On Fri, Jun 01, 2018 at 02:44:09PM +1000, NeilBrown wrote:
> This function has a somewhat confused behavior that is not properly
> described by the documentation.
> Sometimes it returns the previous object, sometimes it returns the
> next one.
> Sometimes it changes the iterator, sometimes it doesn't.
> 
> This function is not currently used and is not worth keeping, so
> remove it.
> 
> A future patch will introduce a new function with a
> simpler interface which can meet the same need that
> this was added for.
> 
> Signed-off-by: NeilBrown <neilb@suse.com>

Please keep Tom Herbert in the loop.  IIRC he had an issue with
this patch.

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply

* Re: [PATCH 0/4] RFC CPSW switchdev mode
From: Andrew Lunn @ 2018-06-02 14:08 UTC (permalink / raw)
  To: Grygorii Strashko
  Cc: Ilias Apalodimas, netdev, ivan.khoronzhuk, nsekhar, jiri, ivecera,
	francois.ozog, yogeshs, spatton
In-Reply-To: <4277c55c-bea2-1061-35e9-62d1d0f60d28@ti.com>

On Fri, Jun 01, 2018 at 04:29:08PM -0500, Grygorii Strashko wrote:
> Hi Ilias,


> Second, Thanks a lot for your great work. I'm still testing it with different
>  use cases and trying to consolidate my reply for all questions.
> 
> All, thanks for your comments.

Hi Grygorii

Something i've said to Ilias already. I would recommend you don't try
to cover all your use cases with the first version. Keep it simple
and clean, don't do anything controversial and get it merged. Then add
more features one by one. We can then discuss any oddball features
while being able to look at the complete system, driver, switchdev and
the network stack.

    Andrew

^ permalink raw reply

* Re: [PATCH V2 mlx5-next 0/2] Mellanox, mlx5 new device events
From: David Miller @ 2018-06-02 13:07 UTC (permalink / raw)
  To: saeedm; +Cc: jgg, netdev, linux-rdma, leonro
In-Reply-To: <a6385f91ca1611184391b1949f26f531ae812911.camel@mellanox.com>

From: Saeed Mahameed <saeedm@mellanox.com>
Date: Sat, 2 Jun 2018 00:17:37 +0000

> On Fri, 2018-06-01 at 17:13 -0700, saeedm@mellanox.com wrote:
>> Hey Dave, this series was meant to go to mlx5-next tree as stated in
>> the cover letter and patches titles "[PATCH V2 mlx5-next 0/2]".
>> 
>> It is ok you applied those patches to net-next this time, but for
>> next
>> time I would like to apply them myself to mlx5-next and send a clean
>> pull request later to both rdma and net trees.
> 
> Sorry please ignore, I just saw that this was sorted out .. :).
> Thanks A lot !
> 
>> Is there a way for me to mark them as delegated in patchwork ?
>> I see that i have "update Properties" option in patchwork but i don't
>> know how to use it or whether I am allowed to do anything with it.
> 
> It would be great if we can mark such patches in patchwork though.. 

I really don't like it when others adjust patchwork state on me.
Florian Fainelli did it for a brief time for some of his own patches
and it caused a lot of confusion for me, so I asked him to stop.

Let's just make sure that the Subject line says mlx5-next or whatever
instead of net-next and I'll be mindful in such situations.

Thanks.

^ permalink raw reply

* Re: [PATCH 00/20] Netfilter/IPVS updates for net-next
From: David Miller @ 2018-06-02 13:04 UTC (permalink / raw)
  To: pablo; +Cc: netfilter-devel, netdev
In-Reply-To: <20180602002259.4024-1-pablo@netfilter.org>

From: Pablo Neira Ayuso <pablo@netfilter.org>
Date: Sat,  2 Jun 2018 02:22:39 +0200

> The following patchset contains Netfilter/IPVS updates for your net-next
> tree, the most relevant things in this batch are:
 ...
> You can pull these changes from:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next.git

Pulled, thanks.

^ permalink raw reply

* Re: [pull request][net-next 00/17] Mellanox, mlx5e updates 2018-06-01
From: David Miller @ 2018-06-02 12:55 UTC (permalink / raw)
  To: saeedm; +Cc: netdev
In-Reply-To: <20180602000544.18717-1-saeedm@mellanox.com>

From: Saeed Mahameed <saeedm@mellanox.com>
Date: Fri,  1 Jun 2018 17:05:27 -0700

> Sorry for the extra 2 patches in this series, but mostly the series
> contains small patches and some fixes to previous patches in this
> submission window, with one main patch from Tariq to improve legacy
> RQ buffer management, for more information please refer to that tag
> log below.
> 
> Please pull and let me know if there's any problem.

Pulled, thanks Saeed.

^ permalink raw reply

* Re: pull-request: bpf 2018-06-02
From: David Miller @ 2018-06-02 12:10 UTC (permalink / raw)
  To: daniel; +Cc: ast, netdev
In-Reply-To: <20180602050722.24694-1-daniel@iogearbox.net>

From: Daniel Borkmann <daniel@iogearbox.net>
Date: Sat,  2 Jun 2018 07:07:22 +0200

> The following pull-request contains BPF updates for your *net* tree.
> 
> The main changes are:
> 
> 1) BPF uapi fix in struct bpf_prog_info and struct bpf_map_info in
>    order to fix offsets on 32 bit archs.
> 
> Please consider pulling these changes from:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf.git
> 
> This will have a minor merge conflict with net-next which has the
> __u32 gpl_compatible:1 bitfield in struct bpf_prog_info at this
> location. Resolution is to use the gpl_compatible member.

Pulled, thanks Daniel.

^ permalink raw reply

* Re: [PATCH 4/4] cpsw: add switchdev support
From: Ilias Apalodimas @ 2018-06-02 10:34 UTC (permalink / raw)
  To: Florian Fainelli
  Cc: Andrew Lunn, netdev, grygorii.strashko, ivan.khoronzhuk, nsekhar,
	jiri, ivecera, francois.ozog, yogeshs, spatton
In-Reply-To: <aa95cdbd-1e89-0b8e-352e-4de80bf0f714@gmail.com>

Hi Florian, 

Thanks for taking time to look into this

On Fri, Jun 01, 2018 at 02:48:48PM -0700, Florian Fainelli wrote:
> 
> 
> On 05/24/2018 09:56 PM, Ilias Apalodimas wrote:
> > On Thu, May 24, 2018 at 06:39:04PM +0200, Andrew Lunn wrote:
> >> On Thu, May 24, 2018 at 04:32:34PM +0300, Ilias Apalodimas wrote:
> >>> On Thu, May 24, 2018 at 03:12:29PM +0200, Andrew Lunn wrote:
> >>>> Device tree is supposed to describe the hardware. Using that hardware
> >>>> in different ways is not something you should describe in DT.
> >>>>
> >>> The new switchdev mode is applied with a .config option in the kernel. What you
> >>> see is pre-existing code, so i am not sure if i should change it in this
> >>> patchset.
> >>
> >> If you break the code up into a library and two drivers, it becomes a
> >> moot point.
> > Agree
> > 
> >>
> >> But what i don't like here is that the device tree says to do dual
> >> mac. But you ignore that and do something else. I would prefer that if
> >> DT says dual mac, and switchdev is compiled in, the probe fails with
> >> EINVAL. Rather than ignore something, make it clear it is invalid.
> > The switch has 3 modes of operation as is.
> > 1. switch mode, to enable that you don't need to add anything on
> > the DTS and linux registers a single netdev interface.
> > 2. dual mac mode, this is when you need to add dual_emac; on the DTS.
> > 3. switchdev mode which is controlled by a .config option, since as you 
> > pointed out DTS was not made for controlling config options. 
> > 
> > I agree that this is far from beautiful. If the driver remains as is though,
> > i'd prefer either keeping what's there or making "switchdev" a DTS option, 
> > following the pre-existing erroneous usage rather than making the device 
> > unusable.  If we end up returning some error and refuse to initialize, users 
> > that remote upgrade their equipment, without taking a good look at changelog,
> > will lose access to their devices with no means of remotely fixing that.
> 
> It seems to me that the mistake here is seeing multiple modes of
> operations for the cpsw. There are not actually many, there is one
> usage, and then there is what you can and cannot offload. 
CPSW has in fact 2 modes of operation, different FIFO usage/lookup entry(it's
called ALE in the current driver) by-pass(which is used in dual emac for 
example) and other features. Again Grygorii is better suited to answer the 
exact differences.
> The basic
> premise with switchdev and DSA (which uses switchdev) is that each
> user-facing port of your switch needs to work as if it were a normal
> Ethernet NIC, that is what you call dual-MAC I believe. Then, when you
> create a bridge and you enslave those ports into the bridge, you need to
> have forwarding done in hardware between these two ports when the
> SMAC/DMAC are not for the host/CPU/management interface and you must
> simultaneously still have the host have the ability to send/receive
> traffic through the bridge device.
Yes dual emac does that. But dual emac configures the port facing VLAN to the
CPU port as well. So dual emac splits and uses 2 interfaces. VLAN 1 is
configured on port1 + CPU port and VLAN 2 is configured on port 2 + CPU port.
That's exactly what the current RFC does as well, with the addition of
registering a sw0p0 (i'll explain why later on this mail)
A little more detail on the issue we are having. On my description 
sw0p0 -> CPU port, sw0p1 -> port 1 sw0p2 -> port 2. sw0p1/sw0p2 are the ports
that have PHYs attached. 

When we start in the new switchdev mode all interfaces are added to VLAN 0
so CPU port + port1 + port2 are all in the same VLAN group. In that case sw0p1
and sw0p2 are working as you describe. So those 2 interfaces can send/receive
traffic normally which matches the switchdev case.

When we add them on a bridge(let's say br0), VLAN1(or any default bridge VLAN)
is now configured on sw0p1 and sw0p2 but *not* on the CPU port. 
From this point on the whole functionality just collapses. The switch will 
work and offload traffic between sw0p1/sw0p2 but the bridge won't be able to 
get an ip address (since VLAN1 is not a member of the CPU port and the packet 
gets dropped). 
IGMPv2/V3 messages will never reach the br_multicast.c code to trigger 
switchdev and configure the MDBs on the ports.  i am pretty sure there are other
fail scenarios which i haven't discovered yet, but those 2 are the most 
basic ones.  If we add VLAN1 on the CPU port, everything works as intended of 
course.

That's the reason we registered sw0p0 as the CPU port. It can't do any "real"
traffic, but you can configure the CPU port independently and not be forced to
do an OR on every VLAN add/delete grouping the CPU port with your port command.
The TL;DR version of this is that the switch is working exactly as switchdev is
expecting, offloading traffic to the hardware when possible, as long as the CPU
port is a member of the proper VLANs.

Petr's patch solves this for us (9c86ce2c1ae337fc10568a12aea812ed03de8319).
We can now do "bridge vlan add dev br0 vid 100 pvid untagged self" and decide
when to add the CPU port or not. 

There are still a couple of cases that are not covered though, if we don't 
register the CPU port. We can't decide when to forward multicast
traffic on the CPU port if a join hasn't been sent from br0.
So let's say you got 2 hosts doing multicast and for whatever reason the host
wants to see that traffic. 
With the CPU port present you can do a 
"bridge mdb add dev br0 port sw0p0 grp 239.1.1.1 permanent" which will offload
the traffic to the CPU port and thus the host. If this goes away we are forced
to send a join.
 
> It seems to me like this is entirely doable given that the dual MAC use
> case is supported already.
> 
> switchdev is just a stateless framework to get notified from the
> networking stack about what you can possibly offload in hardware, so
> having a DTS option gate that is unfortunately wrong because it is
> really implementing a SW policy in DTS which is not what it is meant for.
The DTS option for configuration pre-existed; i don't like that either. Switchdev
mode is activated by a .config option, not DTS (it just overrides whatever config
you have on the DTS). It is far from pretty though, fair point; i am open to
suggestions.
> -- 
> Florian

Thanks!
Ilias

^ permalink raw reply
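
The VLAN membership problem described in the message above can be modelled
in a few lines of userspace C. This is only a sketch under stated
assumptions: the per-VLAN port bitmask, the port numbering (0 for the CPU
port, 1 and 2 for the external ports, following the sw0p0/sw0p1/sw0p2
naming) and all sketch_* names are hypothetical, and nothing here reflects
the actual cpsw ALE programming interface.

```c
#include <stdbool.h>
#include <stdint.h>

enum { SKETCH_CPU_PORT = 0, SKETCH_PORT1 = 1, SKETCH_PORT2 = 2 };

#define SKETCH_MAX_VID 4096

/* One membership bitmask per VLAN ID; bit N set means port N is a member. */
struct sketch_switch {
	uint8_t members[SKETCH_MAX_VID];
};

/* Add one port to a VLAN. Returns 0 on success, -1 on a bad VID or port. */
static int sketch_vlan_add(struct sketch_switch *sw, unsigned int vid,
			   unsigned int port)
{
	if (vid >= SKETCH_MAX_VID || port > SKETCH_PORT2)
		return -1;

	sw->members[vid] = (uint8_t)(sw->members[vid] | (1u << port));
	return 0;
}

/* Hardware forwards between two ports only if both are VLAN members. */
static bool sketch_hw_forwards(const struct sketch_switch *sw,
			       unsigned int vid, unsigned int in,
			       unsigned int out)
{
	uint8_t m = sw->members[vid];

	return (m & (1u << in)) && (m & (1u << out));
}

/* The host (bridge) sees a frame only if the CPU port is a member. */
static bool sketch_host_receives(const struct sketch_switch *sw,
				 unsigned int vid)
{
	return sw->members[vid] & (1u << SKETCH_CPU_PORT);
}
```

In this model, adding VLAN 1 on ports 1 and 2 alone lets the hardware
forward between them while the host receives nothing, which matches the
failure described (no DHCP, no IGMP reaching the bridge). Adding the CPU
port as a separate, explicit step restores host reception without OR-ing it
into every port command.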

* Re: [net-next][PATCH] tcp: probe timer MUST not less than 5 minuter for tcp PMTU
From: Eric Dumazet @ 2018-06-02 10:19 UTC (permalink / raw)
  To: Li RongQing, netdev
In-Reply-To: <1527851039-6626-1-git-send-email-lirongqing@baidu.com>



On 06/01/2018 07:03 AM, Li RongQing wrote:
> RFC 4821 says: The value for this timer MUST NOT be less than
> 5 minutes and is recommended to be 10 minutes, per RFC 1981.
> 
> Signed-off-by: Li RongQing <lirongqing@baidu.com>
> ---
>  net/ipv4/sysctl_net_ipv4.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c
> index d2eed3ddcb0a..ed8952bb6874 100644
> --- a/net/ipv4/sysctl_net_ipv4.c
> +++ b/net/ipv4/sysctl_net_ipv4.c
> @@ -47,6 +47,7 @@ static int tcp_syn_retries_max = MAX_TCP_SYNCNT;
>  static int ip_ping_group_range_min[] = { 0, 0 };
>  static int ip_ping_group_range_max[] = { GID_T_MAX, GID_T_MAX };
>  static int comp_sack_nr_max = 255;
> +static int tcp_probe_interval_min = 300;
>  
>  /* obsolete */
>  static int sysctl_tcp_low_latency __read_mostly;
> @@ -711,7 +712,8 @@ static struct ctl_table ipv4_net_table[] = {
>  		.data		= &init_net.ipv4.sysctl_tcp_probe_interval,
>  		.maxlen		= sizeof(int),
>  		.mode		= 0644,
> -		.proc_handler	= proc_dointvec,
> +		.proc_handler	= proc_dointvec_minmax,
> +		.extra1		= &tcp_probe_interval_min,
>  	},
>  	{
>  		.procname	= "igmp_link_local_mcast_reports",
> 

Note that this change would stop people from being able to have packetdrill
tests which would run in a reasonable amount of time.

I do not believe the Linux kernel must enforce such a limit.

It is up to the admin to set a value here really, depending on the environment
the host is running in.

^ permalink raw reply
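
The effect of the proposed change, and the reason for the objection above,
can be seen in a small userspace sketch of a min/max-checked write. The
function below is hypothetical and only imitates the general behaviour of a
bounded sysctl handler (out-of-range writes are refused and the old value
is kept); it is not the kernel's proc_dointvec_minmax implementation.

```c
#include <errno.h>
#include <stddef.h>

/*
 * Store new_value in *value only if it lies within the optional bounds.
 * A NULL min or max means "no bound on that side".
 * Returns 0 on success, -EINVAL if the value is out of range.
 */
static int sketch_write_minmax(int *value, int new_value,
			       const int *min, const int *max)
{
	if (min && new_value < *min)
		return -EINVAL;

	if (max && new_value > *max)
		return -EINVAL;

	*value = new_value;
	return 0;
}
```

With a minimum of 300 seconds, an attempt to set a 10 second probe interval
for a quick test run is refused and the previous value stays in place; with
no minimum (NULL), the same write succeeds, which is the behaviour the
reply argues should be left to the administrator.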

