Netdev List
 help / color / mirror / Atom feed
* Re: [PATCH 1/4] net: dsa: mv88e6xxx: add support for MV88E6220
From: Andrew Lunn @ 2019-07-30 13:36 UTC (permalink / raw)
  To: Hubert Feurstein
  Cc: netdev, linux-kernel, Vivien Didelot, Florian Fainelli,
	David S. Miller, Rasmus Villemoes
In-Reply-To: <20190730100429.32479-2-h.feurstein@gmail.com>

On Tue, Jul 30, 2019 at 12:04:26PM +0200, Hubert Feurstein wrote:
> The MV88E6220 is almost the same as MV88E6250 except that the ports 2-4 are
> not routed to pins. So the usable ports are 0, 1, 5 and 6.
> 
> Signed-off-by: Hubert Feurstein <h.feurstein@gmail.com>
> ---
>  drivers/net/dsa/mv88e6xxx/chip.c | 25 +++++++++++++++++++++++++
>  drivers/net/dsa/mv88e6xxx/chip.h |  3 ++-
>  drivers/net/dsa/mv88e6xxx/port.h |  1 +
>  3 files changed, 28 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
> index 6b17cd961d06..c4982ced908e 100644
> --- a/drivers/net/dsa/mv88e6xxx/chip.c
> +++ b/drivers/net/dsa/mv88e6xxx/chip.c
> @@ -4283,6 +4283,31 @@ static const struct mv88e6xxx_info mv88e6xxx_table[] = {
>  		.ops = &mv88e6240_ops,
>  	},

Hi Hubert

We try to keep all these lists in strict numerical order. Please can
you add 6220 before 6240, in all the places you have added it.

    Thanks
	Andrew

^ permalink raw reply

* Re: DSA Rate Limiting in 88E6390
From: Andrew Lunn @ 2019-07-30 13:31 UTC (permalink / raw)
  To: Benjamin Beckmeyer; +Cc: netdev
In-Reply-To: <fd08b6b3-3170-bf44-2f05-a0dd92ea868d@eks-engel.de>

> Hi Andrew,
> I've searched the netdev mailing list for DSA and traffic, but can't find anything
> about rate limiting till 2016. Do you have a hint, how I can find it?

I will try to find it later today.
 
> Do you know if the patchset was for Marvell or maybe for another company?

It was another switch vendor.

   Andrew

^ permalink raw reply

* Re: [PATCH] net: phy: fixed_phy: print gpio error only if gpio node is present
From: Andrew Lunn @ 2019-07-30 13:30 UTC (permalink / raw)
  To: Hubert Feurstein
  Cc: netdev, linux-kernel, Florian Fainelli, Heiner Kallweit,
	David S. Miller
In-Reply-To: <20190730094623.31640-1-h.feurstein@gmail.com>

On Tue, Jul 30, 2019 at 11:46:23AM +0200, Hubert Feurstein wrote:
> It is perfectly ok to not have an gpio attached to the fixed-link node. So
> the driver should not throw an error message when the gpio is missing.
> 
> Signed-off-by: Hubert Feurstein <h.feurstein@gmail.com>

Fixes: 5468e82f7034 ("net: phy: fixed-phy: Drop GPIO from fixed_phy_add()")
Reviewed-by: Andrew Lunn <andrew@lunn.ch>

    Andrew

^ permalink raw reply

* RE: [PATCH] net/socket: fix GCC8+ Wpacked-not-aligned warnings
From: David Laight @ 2019-07-30 13:23 UTC (permalink / raw)
  To: 'Qian Cai', davem@davemloft.net
  Cc: vyasevich@gmail.com, nhorman@tuxdriver.com,
	marcelo.leitner@gmail.com, linux-sctp@vger.kernel.org,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org
In-Reply-To: <1564492704.11067.28.camel@lca.pw>

From: Qian Cai 
> Sent: 30 July 2019 14:18
> On Tue, 2019-07-30 at 09:01 +0000, David Laight wrote:
> > From: Qian Cai
> > > Sent: 29 July 2019 21:24
> >
> > ..
> > > To fix this, "struct sockaddr_storage" needs to be aligned to 4-byte as
> > > it is only used in those packed sctp structure which is part of UAPI,
> > > and "struct __kernel_sockaddr_storage" is used in some other
> > > places of UAPI that need not to change alignments in order to not
> > > breaking userspace.
> > >
> > > One option is use typedef between "sockaddr_storage" and
> > > "__kernel_sockaddr_storage" but it needs to change 35 or 370 lines of
> > > codes. The other option is to duplicate this simple 2-field structure to
> > > have a separate "struct sockaddr_storage" so it can use a different
> > > alignment than "__kernel_sockaddr_storage". Also the structure seems
> > > stable enough, so it will be low-maintenance to update two structures in
> > > the future in case of changes.
> >
> > Does it all work if the 8 byte alignment is implicit, not explicit?
> > For instance if unnamed union and struct are used eg:
> >
> > struct sockaddr_storage {
> > 	union {
> > 		void * __ptr;  /* Force alignment */
> > 		struct {
> > 			__kernel_sa_family_t	ss_family;		/* address family */
> > 			/* Following field(s) are implementation specific */
> > 			char	__data[_K_SS_MAXSIZE - sizeof(unsigned short)];
> > 					/* space to achieve desired size, */
> > 					/* _SS_MAXSIZE value minus size of ss_family */
> > 		};
> > 	};
> > };
> >
> > I suspect unnamed unions and structs have to be allowed by the compiler.
> 
> I believe this will suffer the same problem in that will break UAPI,
> 
> https://lore.kernel.org/lkml/20190726213045.GL6204@localhost.localdomain/

You are missing the bit where the UAPI structure is packed.
If the compiler won't let you 'pack' a structure that contains structures
(rather than just integers) then the compiler is broken!

The hope here was that it would be ok is the alignment was implicit not explicit.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

^ permalink raw reply

* Re: [PATCH net-next v4 2/4] enetc: Add mdio bus driver for the PCIe MDIO endpoint
From: Andrew Lunn @ 2019-07-30 13:22 UTC (permalink / raw)
  To: Claudiu Manoil
  Cc: David S . Miller, Rob Herring, Li Yang, alexandru.marginean,
	netdev, devicetree, linux-arm-kernel, linux-kernel
In-Reply-To: <1564479919-18835-3-git-send-email-claudiu.manoil@nxp.com>

On Tue, Jul 30, 2019 at 12:45:17PM +0300, Claudiu Manoil wrote:
> ENETC ports can manage the MDIO bus via local register
> interface.  However there's also a centralized way
> to manage the MDIO bus, via the MDIO PCIe endpoint
> device integrated by the same root complex that also
> integrates the ENETC ports (eth controllers).
> 
> Depending on board design and use case, centralized
> access to MDIO may be better than using local ENETC
> port registers.  For instance, on the LS1028A QDS board
> where MDIO muxing is required.  Also, the LS1028A on-chip
> switch doesn't have a local MDIO register interface.
> 
> The current patch registers the above PCIe endpoint as a
> separate MDIO bus and provides a driver for it by re-using
> the code used for local MDIO access.  It also allows the
> ENETC port PHYs to be managed by this driver if the local
> "mdio" node is missing from the ENETC port node.
> 
> Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>

Reviewed-by: Andrew Lunn <andrew@lunn.ch>

    Andrew

^ permalink raw reply

* Re: [PATCH] net/socket: fix GCC8+ Wpacked-not-aligned warnings
From: Qian Cai @ 2019-07-30 13:18 UTC (permalink / raw)
  To: David Laight, davem@davemloft.net
  Cc: vyasevich@gmail.com, nhorman@tuxdriver.com,
	marcelo.leitner@gmail.com, linux-sctp@vger.kernel.org,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org
In-Reply-To: <91237fd501de4ab895305c4d5666d822@AcuMS.aculab.com>

On Tue, 2019-07-30 at 09:01 +0000, David Laight wrote:
> From: Qian Cai
> > Sent: 29 July 2019 21:24
> 
> ..
> > To fix this, "struct sockaddr_storage" needs to be aligned to 4-byte as
> > it is only used in those packed sctp structure which is part of UAPI,
> > and "struct __kernel_sockaddr_storage" is used in some other
> > places of UAPI that need not to change alignments in order to not
> > breaking userspace.
> > 
> > One option is use typedef between "sockaddr_storage" and
> > "__kernel_sockaddr_storage" but it needs to change 35 or 370 lines of
> > codes. The other option is to duplicate this simple 2-field structure to
> > have a separate "struct sockaddr_storage" so it can use a different
> > alignment than "__kernel_sockaddr_storage". Also the structure seems
> > stable enough, so it will be low-maintenance to update two structures in
> > the future in case of changes.
> 
> Does it all work if the 8 byte alignment is implicit, not explicit?
> For instance if unnamed union and struct are used eg:
> 
> struct sockaddr_storage {
> 	union {
> 		void * __ptr;  /* Force alignment */
> 		struct {
> 			__kernel_sa_family_t	ss_family;		/*
> address family */
> 			/* Following field(s) are implementation specific */
> 			char	__data[_K_SS_MAXSIZE - sizeof(unsigned
> short)];
> 					/* space to achieve desired size, */
> 					/* _SS_MAXSIZE value minus size of
> ss_family */
> 		};
> 	};
> };
> 
> I suspect unnamed unions and structs have to be allowed by the compiler.

I believe this will suffer the same problem in that will break UAPI,

https://lore.kernel.org/lkml/20190726213045.GL6204@localhost.localdomain/

^ permalink raw reply

* Re: [PATCH net-next 2/2] net: phy: broadcom: add 1000Base-X support for BCM54616S
From: Andrew Lunn @ 2019-07-30 13:17 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Tao Ren, Florian Fainelli, Heiner Kallweit, David S . Miller,
	Arun Parameswaran, Justin Chen, netdev, lkml, Andrew Jeffery,
	openbmc@lists.ozlabs.org
In-Reply-To: <CA+h21ho1KOGS3WsNBHzfHkpSyE4k5HTE1tV9wUtnkZhjUZGeUw@mail.gmail.com>

> Again, I don't think Linux has generic support for overwriting (or
> even describing) the operating mode of a PHY, although maybe that's a
> direction we would want to push the discussion towards. RGMII to
> copper, RGMII to fiber, SGMII to copper, copper to fiber (media
> converter), even RGMII to SGMII (RTL8211FS supports this) - lots of
> modes, and this is only for gigabit PHYs...

This is something Russell King has PHYLINK patches for, which have not
yet been merged. There are some boards which use a PHY as a media
converter, placed between the MAC and an SFP.

	   Andrew

^ permalink raw reply

* [PATCH] net: usb: pegasus: fix improper read if get_registers() fail
From: Denis Kirjanov @ 2019-07-30 13:13 UTC (permalink / raw)
  To: petkan, davem; +Cc: netdev, Denis Kirjanov

get_registers() may fail with -ENOMEM and in this
case we can read a garbage from the status variable tmp.

Reported-by: syzbot+3499a83b2d062ae409d4@syzkaller.appspotmail.com
Signed-off-by: Denis Kirjanov <kda@linux-powerpc.org>
---
 drivers/net/usb/pegasus.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/usb/pegasus.c b/drivers/net/usb/pegasus.c
index 6d25dea5ad4b..f7d117d80cfb 100644
--- a/drivers/net/usb/pegasus.c
+++ b/drivers/net/usb/pegasus.c
@@ -282,7 +282,7 @@ static void mdio_write(struct net_device *dev, int phy_id, int loc, int val)
 static int read_eprom_word(pegasus_t *pegasus, __u8 index, __u16 *retdata)
 {
 	int i;
-	__u8 tmp;
+	__u8 tmp = 0;
 	__le16 retdatai;
 	int ret;
 
-- 
2.12.3


^ permalink raw reply related

* [PATCH] net: usb: pegasus: fix improper read if get_registers() fail
From: Denis Kirjanov @ 2019-07-30 12:45 UTC (permalink / raw)
  To: petkan, davem; +Cc: netdev, Denis Kirjanov

get_registers() may fail with -ENOMEM and in this
case we can read a garbage from the status variable tmp.

Reported-by: syzbot+3499a83b2d062ae409d4@syzkaller.appspotmail.com
Signed-off-by: Denis Kirjanov <kda@linux-powerpc.org>
---
 drivers/net/usb/pegasus.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/usb/pegasus.c b/drivers/net/usb/pegasus.c
index 6d25dea5ad4b..f7d117d80cfb 100644
--- a/drivers/net/usb/pegasus.c
+++ b/drivers/net/usb/pegasus.c
@@ -282,7 +282,7 @@ static void mdio_write(struct net_device *dev, int phy_id, int loc, int val)
 static int read_eprom_word(pegasus_t *pegasus, __u8 index, __u16 *retdata)
 {
 	int i;
-	__u8 tmp;
+	__u8 tmp = 0;
 	__le16 retdatai;
 	int ret;
 
-- 
2.12.3


^ permalink raw reply related

* Re: [PATCH] bridge:fragmented packets dropped by bridge
From: Nikolay Aleksandrov @ 2019-07-30 12:41 UTC (permalink / raw)
  To: Rundong Ge, davem
  Cc: kuznet, yoshfuji, netdev, pablo, kadlec, fw, roopa,
	netfilter-devel, coreteam, bridge, linux-kernel
In-Reply-To: <20190730122534.30687-1-rdong.ge@gmail.com>

On 30/07/2019 15:25, Rundong Ge wrote:
> Given following setup:
> -modprobe br_netfilter
> -echo '1' > /proc/sys/net/bridge/bridge-nf-call-iptables
> -brctl addbr br0
> -brctl addif br0 enp2s0
> -brctl addif br0 enp3s0
> -brctl addif br0 enp6s0
> -ifconfig enp2s0 mtu 1300
> -ifconfig enp3s0 mtu 1500
> -ifconfig enp6s0 mtu 1500
> -ifconfig br0 up
> 
>                  multi-port
> mtu1500 - mtu1500|bridge|1500 - mtu1500
>   A                  |            B
>                    mtu1300
> 
> With netfilter defragmentation/conntrack enabled, fragmented
> packets from A will be defragmented in prerouting, and refragmented
> at postrouting.
> But in this scenario the bridge found the frag_max_size(1500) is
> larger than the dst mtu stored in the fake_rtable whitch is
> always equal to the bridge's mtu 1300, then packets will be dopped.
> 
> This modifies ip_skb_dst_mtu to use the out dev's mtu instead
> of bridge's mtu in bridge refragment.
> 
> Signed-off-by: Rundong Ge <rdong.ge@gmail.com>
> ---
>  include/net/ip.h | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/include/net/ip.h b/include/net/ip.h
> index 29d89de..0512de3 100644
> --- a/include/net/ip.h
> +++ b/include/net/ip.h
> @@ -450,6 +450,8 @@ static inline unsigned int ip_dst_mtu_maybe_forward(const struct dst_entry *dst,
>  static inline unsigned int ip_skb_dst_mtu(struct sock *sk,
>  					  const struct sk_buff *skb)
>  {
> +	if ((skb_dst(skb)->flags & DST_FAKE_RTABLE) && skb->dev)
> +		return min(skb->dev->mtu, IP_MAX_MTU);
>  	if (!sk || !sk_fullsock(sk) || ip_sk_use_pmtu(sk)) {
>  		bool forwarding = IPCB(skb)->flags & IPSKB_FORWARDED;
>  
> 

I don't think this is correct, there's a reason why the bridge chooses the smallest
possible MTU out of its members and this is simply a hack to circumvent it.
If you really like to do so just set the bridge MTU manually, we've added support
so it won't change automatically to the smallest, but then how do you pass packets
1500 -> 1300 in this setup ?

You're talking about the frag_size check in br_nf_ip_fragment(), right ?


^ permalink raw reply

* [PATCHv2 net-next 5/5] sctp: factor out sctp_connect_add_peer
From: Xin Long @ 2019-07-30 12:38 UTC (permalink / raw)
  To: network dev, linux-sctp; +Cc: Marcelo Ricardo Leitner, Neil Horman, davem
In-Reply-To: <cover.1564490276.git.lucien.xin@gmail.com>

In this function factored out from sctp_sendmsg_new_asoc() and
__sctp_connect(), it adds a peer with the other addr into the
asoc after this asoc is created with the 1st addr.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
---
 net/sctp/socket.c | 76 +++++++++++++++++++++++--------------------------------
 1 file changed, 31 insertions(+), 45 deletions(-)

diff --git a/net/sctp/socket.c b/net/sctp/socket.c
index 6f77853..2f7e88c 100644
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -1111,6 +1111,33 @@ static int sctp_connect_new_asoc(struct sctp_endpoint *ep,
 	return err;
 }
 
+static int sctp_connect_add_peer(struct sctp_association *asoc,
+				 union sctp_addr *daddr, int addr_len)
+{
+	struct sctp_endpoint *ep = asoc->ep;
+	struct sctp_association *old;
+	struct sctp_transport *t;
+	int err;
+
+	err = sctp_verify_addr(ep->base.sk, daddr, addr_len);
+	if (err)
+		return err;
+
+	old = sctp_endpoint_lookup_assoc(ep, daddr, &t);
+	if (old && old != asoc)
+		return old->state >= SCTP_STATE_ESTABLISHED ? -EISCONN
+							    : -EALREADY;
+
+	if (sctp_endpoint_is_peeled_off(ep, daddr))
+		return -EADDRNOTAVAIL;
+
+	t = sctp_assoc_add_peer(asoc, daddr, GFP_KERNEL, SCTP_UNKNOWN);
+	if (!t)
+		return -ENOMEM;
+
+	return 0;
+}
+
 /* __sctp_connect(struct sock* sk, struct sockaddr *kaddrs, int addrs_size)
  *
  * Common routine for handling connect() and sctp_connectx().
@@ -1119,10 +1146,10 @@ static int sctp_connect_new_asoc(struct sctp_endpoint *ep,
 static int __sctp_connect(struct sock *sk, struct sockaddr *kaddrs,
 			  int addrs_size, int flags, sctp_assoc_t *assoc_id)
 {
-	struct sctp_association *old, *asoc;
 	struct sctp_sock *sp = sctp_sk(sk);
 	struct sctp_endpoint *ep = sp->ep;
 	struct sctp_transport *transport;
+	struct sctp_association *asoc;
 	void *addr_buf = kaddrs;
 	union sctp_addr *daddr;
 	struct sctp_af *af;
@@ -1167,29 +1194,10 @@ static int __sctp_connect(struct sock *sk, struct sockaddr *kaddrs,
 		if (asoc->peer.port != ntohs(daddr->v4.sin_port))
 			goto out_free;
 
-		err = sctp_verify_addr(sk, daddr, af->sockaddr_len);
+		err = sctp_connect_add_peer(asoc, daddr, af->sockaddr_len);
 		if (err)
 			goto out_free;
 
-		old = sctp_endpoint_lookup_assoc(ep, daddr, &transport);
-		if (old && old != asoc) {
-			err = old->state >= SCTP_STATE_ESTABLISHED ? -EISCONN
-								   : -EALREADY;
-			goto out_free;
-		}
-
-		if (sctp_endpoint_is_peeled_off(ep, daddr)) {
-			err = -EADDRNOTAVAIL;
-			goto out_free;
-		}
-
-		transport = sctp_assoc_add_peer(asoc, daddr, GFP_KERNEL,
-						SCTP_UNKNOWN);
-		if (!transport) {
-			err = -ENOMEM;
-			goto out_free;
-		}
-
 		addr_buf  += af->sockaddr_len;
 		walk_size += af->sockaddr_len;
 	}
@@ -1683,8 +1691,6 @@ static int sctp_sendmsg_new_asoc(struct sock *sk, __u16 sflags,
 
 	/* sendv addr list parse */
 	for_each_cmsghdr(cmsg, cmsgs->addrs_msg) {
-		struct sctp_transport *transport;
-		struct sctp_association *old;
 		union sctp_addr _daddr;
 		int dlen;
 
@@ -1718,30 +1724,10 @@ static int sctp_sendmsg_new_asoc(struct sock *sk, __u16 sflags,
 			daddr->v6.sin6_port = htons(asoc->peer.port);
 			memcpy(&daddr->v6.sin6_addr, CMSG_DATA(cmsg), dlen);
 		}
-		err = sctp_verify_addr(sk, daddr, sizeof(*daddr));
-		if (err)
-			goto free;
-
-		old = sctp_endpoint_lookup_assoc(ep, daddr, &transport);
-		if (old && old != asoc) {
-			if (old->state >= SCTP_STATE_ESTABLISHED)
-				err = -EISCONN;
-			else
-				err = -EALREADY;
-			goto free;
-		}
 
-		if (sctp_endpoint_is_peeled_off(ep, daddr)) {
-			err = -EADDRNOTAVAIL;
-			goto free;
-		}
-
-		transport = sctp_assoc_add_peer(asoc, daddr, GFP_KERNEL,
-						SCTP_UNKNOWN);
-		if (!transport) {
-			err = -ENOMEM;
+		err = sctp_connect_add_peer(asoc, daddr, sizeof(*daddr));
+		if (err)
 			goto free;
-		}
 	}
 
 	return 0;
-- 
2.1.0


^ permalink raw reply related

* [PATCHv2 net-next 4/5] sctp: factor out sctp_connect_new_asoc
From: Xin Long @ 2019-07-30 12:38 UTC (permalink / raw)
  To: network dev, linux-sctp; +Cc: Marcelo Ricardo Leitner, Neil Horman, davem
In-Reply-To: <cover.1564490276.git.lucien.xin@gmail.com>

In this function factored out from sctp_sendmsg_new_asoc() and
__sctp_connect(), it creates the asoc and adds a peer with the
1st addr.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
---
 net/sctp/socket.c | 160 ++++++++++++++++++++++++++----------------------------
 1 file changed, 76 insertions(+), 84 deletions(-)

diff --git a/net/sctp/socket.c b/net/sctp/socket.c
index b9804e5..6f77853 100644
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -1044,6 +1044,73 @@ static int sctp_setsockopt_bindx(struct sock *sk,
 	return err;
 }
 
+static int sctp_connect_new_asoc(struct sctp_endpoint *ep,
+				 const union sctp_addr *daddr,
+				 const struct sctp_initmsg *init,
+				 struct sctp_transport **tp)
+{
+	struct sctp_association *asoc;
+	struct sock *sk = ep->base.sk;
+	struct net *net = sock_net(sk);
+	enum sctp_scope scope;
+	int err;
+
+	if (sctp_endpoint_is_peeled_off(ep, daddr))
+		return -EADDRNOTAVAIL;
+
+	if (!ep->base.bind_addr.port) {
+		if (sctp_autobind(sk))
+			return -EAGAIN;
+	} else {
+		if (ep->base.bind_addr.port < inet_prot_sock(net) &&
+		    !ns_capable(net->user_ns, CAP_NET_BIND_SERVICE))
+			return -EACCES;
+	}
+
+	scope = sctp_scope(daddr);
+	asoc = sctp_association_new(ep, sk, scope, GFP_KERNEL);
+	if (!asoc)
+		return -ENOMEM;
+
+	err = sctp_assoc_set_bind_addr_from_ep(asoc, scope, GFP_KERNEL);
+	if (err < 0)
+		goto free;
+
+	*tp = sctp_assoc_add_peer(asoc, daddr, GFP_KERNEL, SCTP_UNKNOWN);
+	if (!*tp) {
+		err = -ENOMEM;
+		goto free;
+	}
+
+	if (!init)
+		return 0;
+
+	if (init->sinit_num_ostreams) {
+		__u16 outcnt = init->sinit_num_ostreams;
+
+		asoc->c.sinit_num_ostreams = outcnt;
+		/* outcnt has been changed, need to re-init stream */
+		err = sctp_stream_init(&asoc->stream, outcnt, 0, GFP_KERNEL);
+		if (err)
+			goto free;
+	}
+
+	if (init->sinit_max_instreams)
+		asoc->c.sinit_max_instreams = init->sinit_max_instreams;
+
+	if (init->sinit_max_attempts)
+		asoc->max_init_attempts = init->sinit_max_attempts;
+
+	if (init->sinit_max_init_timeo)
+		asoc->max_init_timeo =
+			msecs_to_jiffies(init->sinit_max_init_timeo);
+
+	return 0;
+free:
+	sctp_association_free(asoc);
+	return err;
+}
+
 /* __sctp_connect(struct sock* sk, struct sockaddr *kaddrs, int addrs_size)
  *
  * Common routine for handling connect() and sctp_connectx().
@@ -1056,10 +1123,8 @@ static int __sctp_connect(struct sock *sk, struct sockaddr *kaddrs,
 	struct sctp_sock *sp = sctp_sk(sk);
 	struct sctp_endpoint *ep = sp->ep;
 	struct sctp_transport *transport;
-	struct net *net = sock_net(sk);
 	void *addr_buf = kaddrs;
 	union sctp_addr *daddr;
-	enum sctp_scope scope;
 	struct sctp_af *af;
 	int walk_size, err;
 	long timeo;
@@ -1082,32 +1147,10 @@ static int __sctp_connect(struct sock *sk, struct sockaddr *kaddrs,
 		return asoc->state >= SCTP_STATE_ESTABLISHED ? -EISCONN
 							     : -EALREADY;
 
-	if (sctp_endpoint_is_peeled_off(ep, daddr))
-		return -EADDRNOTAVAIL;
-
-	if (!ep->base.bind_addr.port) {
-		if (sctp_autobind(sk))
-			return -EAGAIN;
-	} else {
-		if (ep->base.bind_addr.port < inet_prot_sock(net) &&
-		    !ns_capable(net->user_ns, CAP_NET_BIND_SERVICE))
-			return -EACCES;
-	}
-
-	scope = sctp_scope(daddr);
-	asoc = sctp_association_new(ep, sk, scope, GFP_KERNEL);
-	if (!asoc)
-		return -ENOMEM;
-
-	err = sctp_assoc_set_bind_addr_from_ep(asoc, scope, GFP_KERNEL);
-	if (err < 0)
-		goto out_free;
-
-	transport = sctp_assoc_add_peer(asoc, daddr, GFP_KERNEL, SCTP_UNKNOWN);
-	if (!transport) {
-		err = -ENOMEM;
-		goto out_free;
-	}
+	err = sctp_connect_new_asoc(ep, daddr, NULL, &transport);
+	if (err)
+		return err;
+	asoc = transport->asoc;
 
 	addr_buf += af->sockaddr_len;
 	walk_size = af->sockaddr_len;
@@ -1160,7 +1203,7 @@ static int __sctp_connect(struct sock *sk, struct sockaddr *kaddrs,
 			goto out_free;
 	}
 
-	err = sctp_primitive_ASSOCIATE(net, asoc, NULL);
+	err = sctp_primitive_ASSOCIATE(sock_net(sk), asoc, NULL);
 	if (err < 0)
 		goto out_free;
 
@@ -1597,9 +1640,7 @@ static int sctp_sendmsg_new_asoc(struct sock *sk, __u16 sflags,
 				 struct sctp_transport **tp)
 {
 	struct sctp_endpoint *ep = sctp_sk(sk)->ep;
-	struct net *net = sock_net(sk);
 	struct sctp_association *asoc;
-	enum sctp_scope scope;
 	struct cmsghdr *cmsg;
 	__be32 flowinfo = 0;
 	struct sctp_af *af;
@@ -1614,20 +1655,6 @@ static int sctp_sendmsg_new_asoc(struct sock *sk, __u16 sflags,
 				    sctp_sstate(sk, CLOSING)))
 		return -EADDRNOTAVAIL;
 
-	if (sctp_endpoint_is_peeled_off(ep, daddr))
-		return -EADDRNOTAVAIL;
-
-	if (!ep->base.bind_addr.port) {
-		if (sctp_autobind(sk))
-			return -EAGAIN;
-	} else {
-		if (ep->base.bind_addr.port < inet_prot_sock(net) &&
-		    !ns_capable(net->user_ns, CAP_NET_BIND_SERVICE))
-			return -EACCES;
-	}
-
-	scope = sctp_scope(daddr);
-
 	/* Label connection socket for first association 1-to-many
 	 * style for client sequence socket()->sendmsg(). This
 	 * needs to be done before sctp_assoc_add_peer() as that will
@@ -1643,45 +1670,10 @@ static int sctp_sendmsg_new_asoc(struct sock *sk, __u16 sflags,
 	if (err < 0)
 		return err;
 
-	asoc = sctp_association_new(ep, sk, scope, GFP_KERNEL);
-	if (!asoc)
-		return -ENOMEM;
-
-	if (sctp_assoc_set_bind_addr_from_ep(asoc, scope, GFP_KERNEL) < 0) {
-		err = -ENOMEM;
-		goto free;
-	}
-
-	if (cmsgs->init) {
-		struct sctp_initmsg *init = cmsgs->init;
-
-		if (init->sinit_num_ostreams) {
-			__u16 outcnt = init->sinit_num_ostreams;
-
-			asoc->c.sinit_num_ostreams = outcnt;
-			/* outcnt has been changed, need to re-init stream */
-			err = sctp_stream_init(&asoc->stream, outcnt, 0,
-					       GFP_KERNEL);
-			if (err)
-				goto free;
-		}
-
-		if (init->sinit_max_instreams)
-			asoc->c.sinit_max_instreams = init->sinit_max_instreams;
-
-		if (init->sinit_max_attempts)
-			asoc->max_init_attempts = init->sinit_max_attempts;
-
-		if (init->sinit_max_init_timeo)
-			asoc->max_init_timeo =
-				msecs_to_jiffies(init->sinit_max_init_timeo);
-	}
-
-	*tp = sctp_assoc_add_peer(asoc, daddr, GFP_KERNEL, SCTP_UNKNOWN);
-	if (!*tp) {
-		err = -ENOMEM;
-		goto free;
-	}
+	err = sctp_connect_new_asoc(ep, daddr, cmsgs->init, tp);
+	if (err)
+		return err;
+	asoc = (*tp)->asoc;
 
 	if (!cmsgs->addrs_msg)
 		return 0;
-- 
2.1.0


^ permalink raw reply related

* [PATCHv2 net-next 3/5] sctp: clean up __sctp_connect
From: Xin Long @ 2019-07-30 12:38 UTC (permalink / raw)
  To: network dev, linux-sctp; +Cc: Marcelo Ricardo Leitner, Neil Horman, davem
In-Reply-To: <cover.1564490276.git.lucien.xin@gmail.com>

__sctp_connect is doing quit similar things as sctp_sendmsg_new_asoc.
To factor out common functions, this patch is to clean up their code
to make them look more similar:

  1. create the asoc and add a peer with the 1st addr.
  2. add peers with the other addrs into this asoc one by one.

while at it, also remove the unused 'addrcnt'.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
---
 net/sctp/socket.c | 209 +++++++++++++++++++-----------------------------------
 1 file changed, 73 insertions(+), 136 deletions(-)

diff --git a/net/sctp/socket.c b/net/sctp/socket.c
index e9c5b39..b9804e5 100644
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -1049,153 +1049,105 @@ static int sctp_setsockopt_bindx(struct sock *sk,
  * Common routine for handling connect() and sctp_connectx().
  * Connect will come in with just a single address.
  */
-static int __sctp_connect(struct sock *sk,
-			  struct sockaddr *kaddrs,
-			  int addrs_size, int flags,
-			  sctp_assoc_t *assoc_id)
+static int __sctp_connect(struct sock *sk, struct sockaddr *kaddrs,
+			  int addrs_size, int flags, sctp_assoc_t *assoc_id)
 {
-	struct net *net = sock_net(sk);
-	struct sctp_sock *sp;
-	struct sctp_endpoint *ep;
-	struct sctp_association *asoc = NULL;
-	struct sctp_association *asoc2;
+	struct sctp_association *old, *asoc;
+	struct sctp_sock *sp = sctp_sk(sk);
+	struct sctp_endpoint *ep = sp->ep;
 	struct sctp_transport *transport;
-	union sctp_addr to;
+	struct net *net = sock_net(sk);
+	void *addr_buf = kaddrs;
+	union sctp_addr *daddr;
 	enum sctp_scope scope;
+	struct sctp_af *af;
+	int walk_size, err;
 	long timeo;
-	int err = 0;
-	int addrcnt = 0;
-	int walk_size = 0;
-	union sctp_addr *sa_addr = NULL;
-	void *addr_buf;
-	unsigned short port;
 
-	sp = sctp_sk(sk);
-	ep = sp->ep;
-
-	/* connect() cannot be done on a socket that is already in ESTABLISHED
-	 * state - UDP-style peeled off socket or a TCP-style socket that
-	 * is already connected.
-	 * It cannot be done even on a TCP-style listening socket.
-	 */
 	if (sctp_sstate(sk, ESTABLISHED) || sctp_sstate(sk, CLOSING) ||
-	    (sctp_style(sk, TCP) && sctp_sstate(sk, LISTENING))) {
-		err = -EISCONN;
-		goto out_free;
+	    (sctp_style(sk, TCP) && sctp_sstate(sk, LISTENING)))
+		return -EISCONN;
+
+	daddr = addr_buf;
+	af = sctp_get_af_specific(daddr->sa.sa_family);
+	if (!af || af->sockaddr_len > addrs_size)
+		return -EINVAL;
+
+	err = sctp_verify_addr(sk, daddr, af->sockaddr_len);
+	if (err)
+		return err;
+
+	asoc = sctp_endpoint_lookup_assoc(ep, daddr, &transport);
+	if (asoc)
+		return asoc->state >= SCTP_STATE_ESTABLISHED ? -EISCONN
+							     : -EALREADY;
+
+	if (sctp_endpoint_is_peeled_off(ep, daddr))
+		return -EADDRNOTAVAIL;
+
+	if (!ep->base.bind_addr.port) {
+		if (sctp_autobind(sk))
+			return -EAGAIN;
+	} else {
+		if (ep->base.bind_addr.port < inet_prot_sock(net) &&
+		    !ns_capable(net->user_ns, CAP_NET_BIND_SERVICE))
+			return -EACCES;
 	}
 
-	/* Walk through the addrs buffer and count the number of addresses. */
-	addr_buf = kaddrs;
-	while (walk_size < addrs_size) {
-		struct sctp_af *af;
+	scope = sctp_scope(daddr);
+	asoc = sctp_association_new(ep, sk, scope, GFP_KERNEL);
+	if (!asoc)
+		return -ENOMEM;
 
-		if (walk_size + sizeof(sa_family_t) > addrs_size) {
-			err = -EINVAL;
-			goto out_free;
-		}
+	err = sctp_assoc_set_bind_addr_from_ep(asoc, scope, GFP_KERNEL);
+	if (err < 0)
+		goto out_free;
 
-		sa_addr = addr_buf;
-		af = sctp_get_af_specific(sa_addr->sa.sa_family);
+	transport = sctp_assoc_add_peer(asoc, daddr, GFP_KERNEL, SCTP_UNKNOWN);
+	if (!transport) {
+		err = -ENOMEM;
+		goto out_free;
+	}
 
-		/* If the address family is not supported or if this address
-		 * causes the address buffer to overflow return EINVAL.
-		 */
-		if (!af || (walk_size + af->sockaddr_len) > addrs_size) {
-			err = -EINVAL;
+	addr_buf += af->sockaddr_len;
+	walk_size = af->sockaddr_len;
+	while (walk_size < addrs_size) {
+		err = -EINVAL;
+		if (walk_size + sizeof(sa_family_t) > addrs_size)
 			goto out_free;
-		}
 
-		port = ntohs(sa_addr->v4.sin_port);
-
-		/* Save current address so we can work with it */
-		memcpy(&to, sa_addr, af->sockaddr_len);
+		daddr = addr_buf;
+		af = sctp_get_af_specific(daddr->sa.sa_family);
+		if (!af || af->sockaddr_len + walk_size > addrs_size)
+			goto out_free;
 
-		err = sctp_verify_addr(sk, &to, af->sockaddr_len);
-		if (err)
+		if (asoc->peer.port != ntohs(daddr->v4.sin_port))
 			goto out_free;
 
-		/* Make sure the destination port is correctly set
-		 * in all addresses.
-		 */
-		if (asoc && asoc->peer.port && asoc->peer.port != port) {
-			err = -EINVAL;
+		err = sctp_verify_addr(sk, daddr, af->sockaddr_len);
+		if (err)
 			goto out_free;
-		}
 
-		/* Check if there already is a matching association on the
-		 * endpoint (other than the one created here).
-		 */
-		asoc2 = sctp_endpoint_lookup_assoc(ep, &to, &transport);
-		if (asoc2 && asoc2 != asoc) {
-			if (asoc2->state >= SCTP_STATE_ESTABLISHED)
-				err = -EISCONN;
-			else
-				err = -EALREADY;
+		old = sctp_endpoint_lookup_assoc(ep, daddr, &transport);
+		if (old && old != asoc) {
+			err = old->state >= SCTP_STATE_ESTABLISHED ? -EISCONN
+								   : -EALREADY;
 			goto out_free;
 		}
 
-		/* If we could not find a matching association on the endpoint,
-		 * make sure that there is no peeled-off association matching
-		 * the peer address even on another socket.
-		 */
-		if (sctp_endpoint_is_peeled_off(ep, &to)) {
+		if (sctp_endpoint_is_peeled_off(ep, daddr)) {
 			err = -EADDRNOTAVAIL;
 			goto out_free;
 		}
 
-		if (!asoc) {
-			/* If a bind() or sctp_bindx() is not called prior to
-			 * an sctp_connectx() call, the system picks an
-			 * ephemeral port and will choose an address set
-			 * equivalent to binding with a wildcard address.
-			 */
-			if (!ep->base.bind_addr.port) {
-				if (sctp_autobind(sk)) {
-					err = -EAGAIN;
-					goto out_free;
-				}
-			} else {
-				/*
-				 * If an unprivileged user inherits a 1-many
-				 * style socket with open associations on a
-				 * privileged port, it MAY be permitted to
-				 * accept new associations, but it SHOULD NOT
-				 * be permitted to open new associations.
-				 */
-				if (ep->base.bind_addr.port <
-				    inet_prot_sock(net) &&
-				    !ns_capable(net->user_ns,
-				    CAP_NET_BIND_SERVICE)) {
-					err = -EACCES;
-					goto out_free;
-				}
-			}
-
-			scope = sctp_scope(&to);
-			asoc = sctp_association_new(ep, sk, scope, GFP_KERNEL);
-			if (!asoc) {
-				err = -ENOMEM;
-				goto out_free;
-			}
-
-			err = sctp_assoc_set_bind_addr_from_ep(asoc, scope,
-							      GFP_KERNEL);
-			if (err < 0) {
-				goto out_free;
-			}
-
-		}
-
-		/* Prime the peer's transport structures.  */
-		transport = sctp_assoc_add_peer(asoc, &to, GFP_KERNEL,
+		transport = sctp_assoc_add_peer(asoc, daddr, GFP_KERNEL,
 						SCTP_UNKNOWN);
 		if (!transport) {
 			err = -ENOMEM;
 			goto out_free;
 		}
 
-		addrcnt++;
-		addr_buf += af->sockaddr_len;
+		addr_buf  += af->sockaddr_len;
 		walk_size += af->sockaddr_len;
 	}
 
@@ -1209,39 +1161,24 @@ static int __sctp_connect(struct sock *sk,
 	}
 
 	err = sctp_primitive_ASSOCIATE(net, asoc, NULL);
-	if (err < 0) {
+	if (err < 0)
 		goto out_free;
-	}
 
 	/* Initialize sk's dport and daddr for getpeername() */
 	inet_sk(sk)->inet_dport = htons(asoc->peer.port);
-	sp->pf->to_sk_daddr(sa_addr, sk);
+	sp->pf->to_sk_daddr(daddr, sk);
 	sk->sk_err = 0;
 
-	timeo = sock_sndtimeo(sk, flags & O_NONBLOCK);
-
 	if (assoc_id)
 		*assoc_id = asoc->assoc_id;
 
-	err = sctp_wait_for_connect(asoc, &timeo);
-	/* Note: the asoc may be freed after the return of
-	 * sctp_wait_for_connect.
-	 */
-
-	/* Don't free association on exit. */
-	asoc = NULL;
+	timeo = sock_sndtimeo(sk, flags & O_NONBLOCK);
+	return sctp_wait_for_connect(asoc, &timeo);
 
 out_free:
 	pr_debug("%s: took out_free path with asoc:%p kaddrs:%p err:%d\n",
 		 __func__, asoc, kaddrs, err);
-
-	if (asoc) {
-		/* sctp_primitive_ASSOCIATE may have added this association
-		 * To the hash table, try to unhash it, just in case, its a noop
-		 * if it wasn't hashed so we're safe
-		 */
-		sctp_association_free(asoc);
-	}
+	sctp_association_free(asoc);
 	return err;
 }
 
-- 
2.1.0


^ permalink raw reply related

* [PATCHv2 net-next 2/5] sctp: check addr_size with sa_family_t size in __sctp_setsockopt_connectx
From: Xin Long @ 2019-07-30 12:38 UTC (permalink / raw)
  To: network dev, linux-sctp; +Cc: Marcelo Ricardo Leitner, Neil Horman, davem
In-Reply-To: <cover.1564490276.git.lucien.xin@gmail.com>

Now __sctp_connect() is called by __sctp_setsockopt_connectx() and
sctp_inet_connect(), the latter has done addr_size check with size
of sa_family_t.

In the next patch to clean up __sctp_connect(), we will remove
addr_size check with size of sa_family_t from __sctp_connect()
for the 1st address.

So before doing that, __sctp_setsockopt_connectx() should do
this check first, as sctp_inet_connect() does.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
---
 net/sctp/socket.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/sctp/socket.c b/net/sctp/socket.c
index aa80cda..e9c5b39 100644
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -1311,7 +1311,8 @@ static int __sctp_setsockopt_connectx(struct sock *sk,
 	pr_debug("%s: sk:%p addrs:%p addrs_size:%d\n",
 		 __func__, sk, addrs, addrs_size);
 
-	if (unlikely(addrs_size <= 0))
+	/* make sure the 1st addr's sa_family is accessible later */
+	if (unlikely(addrs_size < sizeof(sa_family_t)))
 		return -EINVAL;
 
 	kaddrs = memdup_user(addrs, addrs_size);
-- 
2.1.0


^ permalink raw reply related

* [PATCHv2 net-next 1/5] sctp: only copy the available addr data in sctp_transport_init
From: Xin Long @ 2019-07-30 12:38 UTC (permalink / raw)
  To: network dev, linux-sctp; +Cc: Marcelo Ricardo Leitner, Neil Horman, davem
In-Reply-To: <cover.1564490276.git.lucien.xin@gmail.com>

'addr' passed to sctp_transport_init is not always a whole size
of union sctp_addr, like the path:

  sctp_sendmsg() ->
  sctp_sendmsg_new_asoc() ->
  sctp_assoc_add_peer() ->
  sctp_transport_new() -> sctp_transport_init()

In the next patches, we will also pass the address length of data
only to sctp_assoc_add_peer().

So sctp_transport_init() should copy the only available data from
addr to peer->ipaddr, instead of 'peer->ipaddr = *addr' which may
cause slab-out-of-bounds.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
---
 net/sctp/transport.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/sctp/transport.c b/net/sctp/transport.c
index e2f8e36..7235a60 100644
--- a/net/sctp/transport.c
+++ b/net/sctp/transport.c
@@ -43,8 +43,8 @@ static struct sctp_transport *sctp_transport_init(struct net *net,
 						  gfp_t gfp)
 {
 	/* Copy in the address.  */
-	peer->ipaddr = *addr;
 	peer->af_specific = sctp_get_af_specific(addr->sa.sa_family);
+	memcpy(&peer->ipaddr, addr, peer->af_specific->sockaddr_len);
 	memset(&peer->saddr, 0, sizeof(union sctp_addr));
 
 	peer->sack_generation = 0;
-- 
2.1.0


^ permalink raw reply related

* [PATCHv2 net-next 0/5] sctp: clean up __sctp_connect function
From: Xin Long @ 2019-07-30 12:38 UTC (permalink / raw)
  To: network dev, linux-sctp; +Cc: Marcelo Ricardo Leitner, Neil Horman, davem

This patchset is to factor out some common code for
sctp_sendmsg_new_asoc() and __sctp_connect() into 2
new functioins.

v1->v2:
  - add the patch 1/5 to avoid a slab-out-of-bounds warning.
  - add some code comment for the check change in patch 2/5.
  - remove unused 'addrcnt' as Marcelo noticed in patch 3/5.

Xin Long (5):
  sctp: only copy the available addr data in sctp_transport_init
  sctp: check addr_size with sa_family_t size in
    __sctp_setsockopt_connectx
  sctp: clean up __sctp_connect
  sctp: factor out sctp_connect_new_asoc
  sctp: factor out sctp_connect_add_peer

 net/sctp/socket.c    | 376 ++++++++++++++++++++-------------------------------
 net/sctp/transport.c |   2 +-
 2 files changed, 147 insertions(+), 231 deletions(-)

-- 
2.1.0


^ permalink raw reply

* Re: [PATCH] bridge:fragmented packets dropped by bridge
From: Florian Westphal @ 2019-07-30 12:35 UTC (permalink / raw)
  To: Rundong Ge
  Cc: davem, kuznet, yoshfuji, netdev, pablo, kadlec, fw, roopa,
	netfilter-devel, coreteam, bridge, nikolay, linux-kernel
In-Reply-To: <20190730122534.30687-1-rdong.ge@gmail.com>

Rundong Ge <rdong.ge@gmail.com> wrote:
> Given following setup:
> -modprobe br_netfilter
> -echo '1' > /proc/sys/net/bridge/bridge-nf-call-iptables
> -brctl addbr br0
> -brctl addif br0 enp2s0
> -brctl addif br0 enp3s0
> -brctl addif br0 enp6s0
> -ifconfig enp2s0 mtu 1300
> -ifconfig enp3s0 mtu 1500
> -ifconfig enp6s0 mtu 1500
> -ifconfig br0 up
> 
>                  multi-port
> mtu1500 - mtu1500|bridge|1500 - mtu1500
>   A                  |            B
>                    mtu1300

How can a bridge forward a frame from A/B to mtu1300?

> With netfilter defragmentation/conntrack enabled, fragmented
> packets from A will be defragmented in prerouting, and refragmented
> at postrouting.

Yes, but I don't see how that relates to the problem at hand.

> But in this scenario the bridge found the frag_max_size(1500) is
> larger than the dst mtu stored in the fake_rtable whitch is
> always equal to the bridge's mtu 1300, then packets will be dopped.

What happens without netfilter or non-fragmented packets?

> This modifies ip_skb_dst_mtu to use the out dev's mtu instead
> of bridge's mtu in bridge refragment.

It seems quite a hack?  The above setup should use a router, not a bridge.

^ permalink raw reply

* Re: [PATCH] net: bridge: Allow bridge to joing multicast groups
From: Allan W. Nielsen @ 2019-07-30 12:19 UTC (permalink / raw)
  To: Ido Schimmel
  Cc: Nikolay Aleksandrov, Horatiu Vultur, roopa, davem, bridge, netdev,
	linux-kernel
In-Reply-To: <20190730100416.GA13250@splinter>

The 07/30/2019 13:04, Ido Schimmel wrote:
> On Tue, Jul 30, 2019 at 10:30:28AM +0200, Allan W. Nielsen wrote:
> > The 07/30/2019 10:06, Ido Schimmel wrote:
> > > As a bonus, existing drivers could benefit from it, as MDB entries are already
> > > notified by MAC.
> > Not sure I follow. When FDB entries are added, it also generates notification
> > events.
> 
> I meant the switchdev notification sent to drivers:
> 
> /* SWITCHDEV_OBJ_ID_PORT_MDB */
> struct switchdev_obj_port_mdb {
> 	struct switchdev_obj obj;
> 	unsigned char addr[ETH_ALEN];
> 	u16 vid;
> };
> 
> By extending MDB entries to also be keyed by MAC you basically get a lot
> of things for free without duplicating the same code for multicast FDBs.
Agree, this should be the same.

> AFAICS, then only change in the fast path is in br_mdb_get() where you
> need to use DMAC as key in case Ethertype is not IPv4/IPv6.
That would be nice.

-- 
/Allan

^ permalink raw reply

* [PATCH] bridge:fragmented packets dropped by bridge
From: Rundong Ge @ 2019-07-30 12:25 UTC (permalink / raw)
  To: davem
  Cc: kuznet, yoshfuji, netdev, pablo, kadlec, fw, roopa,
	netfilter-devel, coreteam, bridge, nikolay, linux-kernel,
	rdong.ge

Given following setup:
-modprobe br_netfilter
-echo '1' > /proc/sys/net/bridge/bridge-nf-call-iptables
-brctl addbr br0
-brctl addif br0 enp2s0
-brctl addif br0 enp3s0
-brctl addif br0 enp6s0
-ifconfig enp2s0 mtu 1300
-ifconfig enp3s0 mtu 1500
-ifconfig enp6s0 mtu 1500
-ifconfig br0 up

                 multi-port
mtu1500 - mtu1500|bridge|1500 - mtu1500
  A                  |            B
                   mtu1300

With netfilter defragmentation/conntrack enabled, fragmented
packets from A will be defragmented in prerouting, and refragmented
at postrouting.
But in this scenario the bridge found the frag_max_size(1500) is
larger than the dst mtu stored in the fake_rtable whitch is
always equal to the bridge's mtu 1300, then packets will be dopped.

This modifies ip_skb_dst_mtu to use the out dev's mtu instead
of bridge's mtu in bridge refragment.

Signed-off-by: Rundong Ge <rdong.ge@gmail.com>
---
 include/net/ip.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/include/net/ip.h b/include/net/ip.h
index 29d89de..0512de3 100644
--- a/include/net/ip.h
+++ b/include/net/ip.h
@@ -450,6 +450,8 @@ static inline unsigned int ip_dst_mtu_maybe_forward(const struct dst_entry *dst,
 static inline unsigned int ip_skb_dst_mtu(struct sock *sk,
 					  const struct sk_buff *skb)
 {
+	if ((skb_dst(skb)->flags & DST_FAKE_RTABLE) && skb->dev)
+		return min(skb->dev->mtu, IP_MAX_MTU);
 	if (!sk || !sk_fullsock(sk) || ip_sk_use_pmtu(sk)) {
 		bool forwarding = IPCB(skb)->flags & IPSKB_FORWARDED;
 
-- 
1.8.3.1


^ permalink raw reply related

* Re: [PATCH][net-next][V2] mlxsw: spectrum_ptp: fix duplicated check on orig_egr_types
From: Colin Ian King @ 2019-07-30 12:22 UTC (permalink / raw)
  To: Petr Machata, Jiri Pirko, Ido Schimmel, David S . Miller, netdev
  Cc: kernel-janitors, linux-kernel
In-Reply-To: <20190730114752.24133-1-colin.king@canonical.com>

As pointed out by Petr, this should be [net] and not [net-next]


On 30/07/2019 12:47, Colin King wrote:
> From: Colin Ian King <colin.king@canonical.com>
> 
> Currently are duplicated checks on orig_egr_types which are
> redundant, I believe this is a typo and should actually be
> orig_ing_types || orig_egr_types instead of the expression
> orig_egr_types || orig_egr_types.  Fix these.
> 
> Addresses-Coverity: ("Same on both sides")
> Fixes: c6b36bdd04b5 ("mlxsw: spectrum_ptp: Increase parsing depth when PTP is enabled")
> Signed-off-by: Colin Ian King <colin.king@canonical.com>
> ---
>  drivers/net/ethernet/mellanox/mlxsw/spectrum_ptp.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_ptp.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_ptp.c
> index 98c5ba3200bc..63b07edd9d81 100644
> --- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_ptp.c
> +++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_ptp.c
> @@ -999,14 +999,14 @@ static int mlxsw_sp1_ptp_mtpppc_update(struct mlxsw_sp_port *mlxsw_sp_port,
>  		}
>  	}
>  
> -	if ((ing_types || egr_types) && !(orig_egr_types || orig_egr_types)) {
> +	if ((ing_types || egr_types) && !(orig_ing_types || orig_egr_types)) {
>  		err = mlxsw_sp_nve_inc_parsing_depth_get(mlxsw_sp);
>  		if (err) {
>  			netdev_err(mlxsw_sp_port->dev, "Failed to increase parsing depth");
>  			return err;
>  		}
>  	}
> -	if (!(ing_types || egr_types) && (orig_egr_types || orig_egr_types))
> +	if (!(ing_types || egr_types) && (orig_ing_types || orig_egr_types))
>  		mlxsw_sp_nve_inc_parsing_depth_put(mlxsw_sp);
>  
>  	return mlxsw_sp1_ptp_mtpppc_set(mlxsw_sp_port->mlxsw_sp,
> 
> --
> 
> V2: fix both occurances of this typo. Thanks to Petr Machata for spotting
>     the 2nd occurrance
> 


^ permalink raw reply

* [PATCH net-next] net: bridge: mcast: add delete due to fast-leave mdb flag
From: Nikolay Aleksandrov @ 2019-07-30 12:20 UTC (permalink / raw)
  To: netdev; +Cc: davem, roopa, bridge, Nikolay Aleksandrov

In user-space there's no way to distinguish why an mdb entry was deleted
and that is a problem for daemons which would like to keep the mdb in
sync with remote ends (e.g. mlag) but would also like to converge faster.
In almost all cases we'd like to age-out the remote entry for performance
and convergence reasons except when fast-leave is enabled. In that case we
want explicit immediate remote delete, thus add mdb flag which is set only
when the entry is being deleted due to fast-leave.

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
---
 include/uapi/linux/if_bridge.h | 1 +
 net/bridge/br_mdb.c            | 2 ++
 net/bridge/br_multicast.c      | 2 +-
 net/bridge/br_private.h        | 1 +
 4 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/include/uapi/linux/if_bridge.h b/include/uapi/linux/if_bridge.h
index 773e476a8e54..1b3c2b643a02 100644
--- a/include/uapi/linux/if_bridge.h
+++ b/include/uapi/linux/if_bridge.h
@@ -237,6 +237,7 @@ struct br_mdb_entry {
 #define MDB_PERMANENT 1
 	__u8 state;
 #define MDB_FLAGS_OFFLOAD	(1 << 0)
+#define MDB_FLAGS_FAST_LEAVE	(1 << 1)
 	__u8 flags;
 	__u16 vid;
 	struct {
diff --git a/net/bridge/br_mdb.c b/net/bridge/br_mdb.c
index bf6acd34234d..428af1abf8cc 100644
--- a/net/bridge/br_mdb.c
+++ b/net/bridge/br_mdb.c
@@ -60,6 +60,8 @@ static void __mdb_entry_fill_flags(struct br_mdb_entry *e, unsigned char flags)
 	e->flags = 0;
 	if (flags & MDB_PG_FLAGS_OFFLOAD)
 		e->flags |= MDB_FLAGS_OFFLOAD;
+	if (flags & MDB_PG_FLAGS_FAST_LEAVE)
+		e->flags |= MDB_FLAGS_FAST_LEAVE;
 }
 
 static void __mdb_entry_to_br_ip(struct br_mdb_entry *entry, struct br_ip *ip)
diff --git a/net/bridge/br_multicast.c b/net/bridge/br_multicast.c
index 3d8deac2353d..3d4b2817687f 100644
--- a/net/bridge/br_multicast.c
+++ b/net/bridge/br_multicast.c
@@ -1393,7 +1393,7 @@ br_multicast_leave_group(struct net_bridge *br,
 			del_timer(&p->timer);
 			kfree_rcu(p, rcu);
 			br_mdb_notify(br->dev, port, group, RTM_DELMDB,
-				      p->flags);
+				      p->flags | MDB_PG_FLAGS_FAST_LEAVE);
 
 			if (!mp->ports && !mp->host_joined &&
 			    netif_running(br->dev))
diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
index e8cf03b43b7d..c4fd307fbfdc 100644
--- a/net/bridge/br_private.h
+++ b/net/bridge/br_private.h
@@ -199,6 +199,7 @@ struct net_bridge_fdb_entry {
 
 #define MDB_PG_FLAGS_PERMANENT	BIT(0)
 #define MDB_PG_FLAGS_OFFLOAD	BIT(1)
+#define MDB_PG_FLAGS_FAST_LEAVE	BIT(2)
 
 struct net_bridge_port_group {
 	struct net_bridge_port		*port;
-- 
2.21.0


^ permalink raw reply related

* Re: [PATCH][net-next][V2] mlxsw: spectrum_ptp: fix duplicated check on orig_egr_types
From: Petr Machata @ 2019-07-30 12:19 UTC (permalink / raw)
  To: Colin King
  Cc: Jiri Pirko, Ido Schimmel, David S . Miller,
	netdev@vger.kernel.org, kernel-janitors@vger.kernel.org,
	linux-kernel@vger.kernel.org
In-Reply-To: <20190730114752.24133-1-colin.king@canonical.com>


Colin King <colin.king@canonical.com> writes:

> From: Colin Ian King <colin.king@canonical.com>
>
> Currently are duplicated checks on orig_egr_types which are
> redundant, I believe this is a typo and should actually be
> orig_ing_types || orig_egr_types instead of the expression
> orig_egr_types || orig_egr_types.  Fix these.
>
> Addresses-Coverity: ("Same on both sides")
> Fixes: c6b36bdd04b5 ("mlxsw: spectrum_ptp: Increase parsing depth when PTP is enabled")
> Signed-off-by: Colin Ian King <colin.king@canonical.com>

Thank you, it looks good. But can you please direct this to "net", not
"net-next"?

^ permalink raw reply

* NETIF_F_LLTX breaks iwlwifi
From: Jean Delvare @ 2019-07-30 12:18 UTC (permalink / raw)
  To: Felix Fietkau, Toke Høiland-Jørgensen, Johannes Berg
  Cc: David S. Miller, linux-wireless, netdev, linux-kernel

Hi Felix, Toke, Johannes,

After updating to kernel 5.2, I started losing wireless network on my
workstation a few minutes after boot. I could restart the network
service to get it back, but it would go away again a few minutes later.
No error message logged, but somehow the network traffic was no long
being processed.

My hardware is:

05:00.0 Network controller [0280]: Intel Corporation Wireless 8265 / 8275 [8086:24fd] (rev 78)

This is an Intel 8265 PCIe WiFI adapter by Gigabyte, model GC-WB867D-I,
which worked flawlessly for me until then.

I bisected it down to:

commit 8dbb000ee73be2c05e34756739ce308885312a29 (refs/bisect/bad)
Author: Felix Fietkau
Date:   Sat Mar 16 18:06:34 2019 +0100

    mac80211: set NETIF_F_LLTX when using intermediate tx queues

So whatever the commit message says, it is apparently not safe to run
TX handlers on multiple CPUs in parallel for this specific driver /
device.

Unless someone has an immediate explanation as to why it broke the
iwlwifi driver and the actual bug is in iwlwifi and it can be fixed
quickly and easily there, I would suggest that the above commit is
reverted for the time being, as apparently it wasn't fixing anything
but was just a performance optimization.

I am available to do any amount of tests or debugging, given the
guidance.

Thanks,
-- 
Jean Delvare
SUSE L3 Support

^ permalink raw reply

* KMSAN: uninit-value in smsc95xx_wait_eeprom
From: syzbot @ 2019-07-30 12:18 UTC (permalink / raw)
  To: UNGLinuxDriver, davem, glider, linux-kernel, linux-usb, netdev,
	steve.glendinning, syzkaller-bugs

Hello,

syzbot found the following crash on:

HEAD commit:    beaab8a3 fix KASAN build
git tree:       kmsan
console output: https://syzkaller.appspot.com/x/log.txt?x=1685d8bfa00000
kernel config:  https://syzkaller.appspot.com/x/.config?x=4db781fe35a84ef5
dashboard link: https://syzkaller.appspot.com/bug?extid=136c17d735f025fc86a7
compiler:       clang version 9.0.0 (/home/glider/llvm/clang  
80fee25776c2fb61e74c1ecb1a523375c2500b69)

Unfortunately, I don't have any reproducer for this crash yet.

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+136c17d735f025fc86a7@syzkaller.appspotmail.com

usb 1-1: New USB device found, idVendor=0424, idProduct=9908,  
bcdDevice=6a.5e
usb 1-1: New USB device strings: Mfr=0, Product=0, SerialNumber=0
usb 1-1: config 0 descriptor??
smsc95xx v1.0.6
==================================================================
BUG: KMSAN: uninit-value in smsc95xx_wait_eeprom+0x1fb/0x3d0  
drivers/net/usb/smsc95xx.c:300
CPU: 0 PID: 12 Comm: kworker/0:1 Not tainted 5.2.0+ #15
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS  
Google 01/01/2011
Workqueue: usb_hub_wq hub_event
Call Trace:
  __dump_stack lib/dump_stack.c:77 [inline]
  dump_stack+0x191/0x1f0 lib/dump_stack.c:113
  kmsan_report+0x162/0x2d0 mm/kmsan/kmsan_report.c:109
  __msan_warning+0x75/0xe0 mm/kmsan/kmsan_instr.c:294
  smsc95xx_wait_eeprom+0x1fb/0x3d0 drivers/net/usb/smsc95xx.c:300
  smsc95xx_read_eeprom+0x3c2/0x920 drivers/net/usb/smsc95xx.c:357
  smsc95xx_init_mac_address drivers/net/usb/smsc95xx.c:914 [inline]
  smsc95xx_bind+0x467/0x1690 drivers/net/usb/smsc95xx.c:1286
  usbnet_probe+0x10d3/0x3950 drivers/net/usb/usbnet.c:1722
  usb_probe_interface+0xd19/0x1310 drivers/usb/core/driver.c:361
  really_probe+0x1344/0x1d90 drivers/base/dd.c:513
  driver_probe_device+0x1ba/0x510 drivers/base/dd.c:670
  __device_attach_driver+0x5b8/0x790 drivers/base/dd.c:777
  bus_for_each_drv+0x28e/0x3b0 drivers/base/bus.c:454
  __device_attach+0x489/0x750 drivers/base/dd.c:843
  device_initial_probe+0x4a/0x60 drivers/base/dd.c:890
  bus_probe_device+0x131/0x390 drivers/base/bus.c:514
  device_add+0x25b5/0x2df0 drivers/base/core.c:2111
  usb_set_configuration+0x309f/0x3710 drivers/usb/core/message.c:2027
  generic_probe+0xe7/0x280 drivers/usb/core/generic.c:210
  usb_probe_device+0x146/0x200 drivers/usb/core/driver.c:266
  really_probe+0x1344/0x1d90 drivers/base/dd.c:513
  driver_probe_device+0x1ba/0x510 drivers/base/dd.c:670
  __device_attach_driver+0x5b8/0x790 drivers/base/dd.c:777
  bus_for_each_drv+0x28e/0x3b0 drivers/base/bus.c:454
  __device_attach+0x489/0x750 drivers/base/dd.c:843
  device_initial_probe+0x4a/0x60 drivers/base/dd.c:890
  bus_probe_device+0x131/0x390 drivers/base/bus.c:514
  device_add+0x25b5/0x2df0 drivers/base/core.c:2111
  usb_new_device+0x23e5/0x2fb0 drivers/usb/core/hub.c:2534
  hub_port_connect drivers/usb/core/hub.c:5089 [inline]
  hub_port_connect_change drivers/usb/core/hub.c:5204 [inline]
  port_event drivers/usb/core/hub.c:5350 [inline]
  hub_event+0x5853/0x7320 drivers/usb/core/hub.c:5432
  process_one_work+0x1572/0x1f00 kernel/workqueue.c:2269
  worker_thread+0x111b/0x2460 kernel/workqueue.c:2415
  kthread+0x4b5/0x4f0 kernel/kthread.c:256
  ret_from_fork+0x35/0x40 arch/x86/entry/entry_64.S:355

Local variable description: ----buf.i.i@smsc95xx_wait_eeprom
Variable was created at:
  __smsc95xx_read_reg drivers/net/usb/smsc95xx.c:80 [inline]
  smsc95xx_read_reg drivers/net/usb/smsc95xx.c:144 [inline]
  smsc95xx_wait_eeprom+0xb6/0x3d0 drivers/net/usb/smsc95xx.c:294
  smsc95xx_read_eeprom+0x3c2/0x920 drivers/net/usb/smsc95xx.c:357
==================================================================


---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this bug report. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

^ permalink raw reply

* [PATCH][net-next][V2] mlxsw: spectrum_ptp: fix duplicated check on orig_egr_types
From: Colin King @ 2019-07-30 11:47 UTC (permalink / raw)
  To: Petr Machata, Jiri Pirko, Ido Schimmel, David S . Miller, netdev
  Cc: kernel-janitors, linux-kernel

From: Colin Ian King <colin.king@canonical.com>

Currently are duplicated checks on orig_egr_types which are
redundant, I believe this is a typo and should actually be
orig_ing_types || orig_egr_types instead of the expression
orig_egr_types || orig_egr_types.  Fix these.

Addresses-Coverity: ("Same on both sides")
Fixes: c6b36bdd04b5 ("mlxsw: spectrum_ptp: Increase parsing depth when PTP is enabled")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
---
 drivers/net/ethernet/mellanox/mlxsw/spectrum_ptp.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_ptp.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_ptp.c
index 98c5ba3200bc..63b07edd9d81 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_ptp.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_ptp.c
@@ -999,14 +999,14 @@ static int mlxsw_sp1_ptp_mtpppc_update(struct mlxsw_sp_port *mlxsw_sp_port,
 		}
 	}
 
-	if ((ing_types || egr_types) && !(orig_egr_types || orig_egr_types)) {
+	if ((ing_types || egr_types) && !(orig_ing_types || orig_egr_types)) {
 		err = mlxsw_sp_nve_inc_parsing_depth_get(mlxsw_sp);
 		if (err) {
 			netdev_err(mlxsw_sp_port->dev, "Failed to increase parsing depth");
 			return err;
 		}
 	}
-	if (!(ing_types || egr_types) && (orig_egr_types || orig_egr_types))
+	if (!(ing_types || egr_types) && (orig_ing_types || orig_egr_types))
 		mlxsw_sp_nve_inc_parsing_depth_put(mlxsw_sp);
 
 	return mlxsw_sp1_ptp_mtpppc_set(mlxsw_sp_port->mlxsw_sp,

--

V2: fix both occurances of this typo. Thanks to Petr Machata for spotting
    the 2nd occurrance
-- 
2.20.1


^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox