Netdev List

Netdev List
 help / color / mirror / Atom feed

* Re: [PATCH net] nfp: flower: verify that block cb is not busy before binding
From: David Miller @ 2019-08-20  1:16 UTC (permalink / raw)
  To: vladbu; +Cc: netdev, jhs, xiyou.wangcong, jiri, jakub.kicinski, pablo
In-Reply-To: <20190819073304.9419-1-vladbu@mellanox.com>

From: Vlad Buslov <vladbu@mellanox.com>
Date: Mon, 19 Aug 2019 10:33:04 +0300

> When processing FLOW_BLOCK_BIND command on indirect block, check that flow
> block cb is not busy.
> 
> Fixes: 0d4fd02e7199 ("net: flow_offload: add flow_block_cb_is_busy() and use it")
> Reported-by: Jakub Kicinski <jakub.kicinski@netronome.com>
> Signed-off-by: Vlad Buslov <vladbu@mellanox.com>

Applied.

^ permalink raw reply

* Re: [PATCH] Kconfig: Fix the reference to the IDT77105 Phy driver in the description of ATM_NICSTAR_USE_IDT77105
From: David Miller @ 2019-08-20  1:15 UTC (permalink / raw)
  To: christophe.jaillet
  Cc: 3chas3, linux-atm-general, netdev, linux-kernel, kernel-janitors
In-Reply-To: <20190819050425.6119-1-christophe.jaillet@wanadoo.fr>

From: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Date: Mon, 19 Aug 2019 07:04:25 +0200

> This should be IDT77105, not IDT77015.
> 
> Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>

Applied.

^ permalink raw reply

* Re: [PATCH net-next] r8152: fix accessing skb after napi_gro_receive
From: David Miller @ 2019-08-20  1:13 UTC (permalink / raw)
  To: hayeswang; +Cc: netdev, nic_swsd, linux-kernel
In-Reply-To: <1394712342-15778-302-Taiwan-albertk@realtek.com>

From: Hayes Wang <hayeswang@realtek.com>
Date: Mon, 19 Aug 2019 11:15:19 +0800

> Fix accessing skb after napi_gro_receive which is caused by
> commit 47922fcde536 ("r8152: support skb_add_rx_frag").
> 
> Fixes: 47922fcde536 ("r8152: support skb_add_rx_frag")
> Signed-off-by: Hayes Wang <hayeswang@realtek.com>

Applied, thanks.

^ permalink raw reply

* Re: [PATCH] sock: fix potential memory leak in proto_register()
From: David Miller @ 2019-08-20  1:12 UTC (permalink / raw)
  To: zhang.lin16
  Cc: ast, daniel, kafai, songliubraving, yhs, willemb, edumazet,
	deepa.kernel, arnd, dh.herrmann, gnault, netdev, linux-kernel,
	bpf, xue.zhihong, wang.yi59, jiang.xuexin
In-Reply-To: <1566178556-46071-1-git-send-email-zhang.lin16@zte.com.cn>

From: zhanglin <zhang.lin16@zte.com.cn>
Date: Mon, 19 Aug 2019 09:35:56 +0800

> If protocols registered exceeded PROTO_INUSE_NR, prot will be
> added to proto_list, but no available bit left for prot in
> proto_inuse_idx.
> 
> Signed-off-by: zhanglin <zhang.lin16@zte.com.cn>

This won't build with CONFIG_PROC_FS disabled.

^ permalink raw reply

* [PATCH] ipvs: change type of delta and previous_delta in ip_vs_seq.
From: zhang kai @ 2019-08-20  0:37 UTC (permalink / raw)
  To: wensong, horms, ja, davem, kuznet, yoshfuji; +Cc: lvs-devel, netdev

In NAT forwarding mode, Applications may decrease the size of packets,
and TCP sequences will get smaller, so both of variables will be negetive
values in this case.

Signed-off-by: zhang kai <zhangkaiheb@126.com>
---
 include/net/ip_vs.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/net/ip_vs.h b/include/net/ip_vs.h
index 3759167f91f5..de7e75063c7c 100644
--- a/include/net/ip_vs.h
+++ b/include/net/ip_vs.h
@@ -346,8 +346,8 @@ enum ip_vs_sctp_states {
  */
 struct ip_vs_seq {
 	__u32			init_seq;	/* Add delta from this seq */
-	__u32			delta;		/* Delta in sequence numbers */
-	__u32			previous_delta;	/* Delta in sequence numbers
+	__s32			delta;		/* Delta in sequence numbers */
+	__s32			previous_delta;	/* Delta in sequence numbers
 						 * before last resized pkt */
 };

-- 
2.17.1

^ permalink raw reply related

* [PATCH 1/1] netfilter: nf_tables: fib: Drop IPV6 packages if IPv6 is disabled on boot
From: Leonardo Bras @ 2019-08-20  0:58 UTC (permalink / raw)
  To: netfilter-devel, coreteam, netdev, linux-kernel
  Cc: Leonardo Bras, Pablo Neira Ayuso, Jozsef Kadlecsik,
	Florian Westphal, David S. Miller

If IPv6 is disabled on boot (ipv6.disable=1), but nft_fib_inet ends up
dealing with a IPv6 package, it causes a kernel panic in
fib6_node_lookup_1(), crashing in bad_page_fault.

The panic is caused by trying to deference a very low address (0x38
in ppc64le), due to ipv6.fib6_main_tbl = NULL.
BUG: Kernel NULL pointer dereference at 0x00000038

Fix this behavior by dropping IPv6 packages if !ipv6_mod_enabled().

Signed-off-by: Leonardo Bras <leonardo@linux.ibm.com>
---
 net/netfilter/nft_fib_inet.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/net/netfilter/nft_fib_inet.c b/net/netfilter/nft_fib_inet.c
index 465432e0531b..0017afab3c51 100644
--- a/net/netfilter/nft_fib_inet.c
+++ b/net/netfilter/nft_fib_inet.c
@@ -2,6 +2,7 @@
 
 #include <linux/kernel.h>
 #include <linux/init.h>
+#include <linux/ipv6.h>
 #include <linux/module.h>
 #include <linux/netlink.h>
 #include <linux/netfilter.h>
@@ -28,6 +29,8 @@ static void nft_fib_inet_eval(const struct nft_expr *expr,
 		}
 		break;
 	case NFPROTO_IPV6:
+		if (!ipv6_mod_enabled())
+			break;
 		switch (priv->result) {
 		case NFT_FIB_RESULT_OIF:
 		case NFT_FIB_RESULT_OIFNAME:
-- 
2.20.1


^ permalink raw reply related

* [PATCH 1/1] net: rds: add service level support in rds-info
From: Zhu Yanjun @ 2019-08-20  0:52 UTC (permalink / raw)
  To: davem, yanjun.zhu, netdev, linux-rdma, rds-devel

From IB specific 7.6.5 SERVICE LEVEL, Service Level (SL)
is used to identify different flows within an IBA subnet.
It is carried in the local route header of the packet.

Before this commit, run "rds-info -I". The output is as
below:
"
RDS IB Connections:
 LocalAddr  RemoteAddr Tos SL  LocalDev               RemoteDev
192.2.95.3  192.2.95.1  2   0  fe80::21:28:1a:39  fe80::21:28:10:b9
192.2.95.3  192.2.95.1  1   0  fe80::21:28:1a:39  fe80::21:28:10:b9
192.2.95.3  192.2.95.1  0   0  fe80::21:28:1a:39  fe80::21:28:10:b9
"
After this commit, the output is as below:
"
RDS IB Connections:
 LocalAddr  RemoteAddr Tos SL  LocalDev               RemoteDev
192.2.95.3  192.2.95.1  2   2  fe80::21:28:1a:39  fe80::21:28:10:b9
192.2.95.3  192.2.95.1  1   1  fe80::21:28:1a:39  fe80::21:28:10:b9
192.2.95.3  192.2.95.1  0   0  fe80::21:28:1a:39  fe80::21:28:10:b9
"

The commit fe3475af3bdf ("net: rds: add per rds connection cache
statistics") adds cache_allocs in struct rds_info_rdma_connection
as below:
struct rds_info_rdma_connection {
...
        __u32           rdma_mr_max;
        __u32           rdma_mr_size;
        __u8            tos;
        __u32           cache_allocs;
 };
The peer struct in rds-tools of struct rds_info_rdma_connection is as
below:
struct rds_info_rdma_connection {
...
        uint32_t        rdma_mr_max;
        uint32_t        rdma_mr_size;
        uint8_t         tos;
        uint8_t         sl;
        uint32_t        cache_allocs;
};
The difference between userspace and kernel is the member variable sl.
In kernel struct, the member variable sl is missing. This will introduce
risks. So it is necessary to use this commit to avoid this risk.

Fixes: fe3475af3bdf ("net: rds: add per rds connection cache statistics")
CC: Joe Jin <joe.jin@oracle.com>
CC: JUNXIAO_BI <junxiao.bi@oracle.com>
Suggested-by: Gerd Rausch <gerd.rausch@oracle.com>
Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com>
---
 include/uapi/linux/rds.h |    2 ++
 net/rds/ib.c             |   16 ++++++++++------
 net/rds/ib.h             |    1 +
 net/rds/ib_cm.c          |    3 +++
 net/rds/rdma_transport.c |   10 ++++++++--
 5 files changed, 24 insertions(+), 8 deletions(-)

diff --git a/include/uapi/linux/rds.h b/include/uapi/linux/rds.h
index fd6b5f6..cba368e 100644
--- a/include/uapi/linux/rds.h
+++ b/include/uapi/linux/rds.h
@@ -250,6 +250,7 @@ struct rds_info_rdma_connection {
 	__u32		rdma_mr_max;
 	__u32		rdma_mr_size;
 	__u8		tos;
+	__u8		sl;
 	__u32		cache_allocs;
 };
 
@@ -265,6 +266,7 @@ struct rds6_info_rdma_connection {
 	__u32		rdma_mr_max;
 	__u32		rdma_mr_size;
 	__u8		tos;
+	__u8		sl;
 	__u32		cache_allocs;
 };
 
diff --git a/net/rds/ib.c b/net/rds/ib.c
index ec05d91..45acab2 100644
--- a/net/rds/ib.c
+++ b/net/rds/ib.c
@@ -291,7 +291,7 @@ static int rds_ib_conn_info_visitor(struct rds_connection *conn,
 				    void *buffer)
 {
 	struct rds_info_rdma_connection *iinfo = buffer;
-	struct rds_ib_connection *ic;
+	struct rds_ib_connection *ic = conn->c_transport_data;
 
 	/* We will only ever look at IB transports */
 	if (conn->c_trans != &rds_ib_transport)
@@ -301,15 +301,16 @@ static int rds_ib_conn_info_visitor(struct rds_connection *conn,
 
 	iinfo->src_addr = conn->c_laddr.s6_addr32[3];
 	iinfo->dst_addr = conn->c_faddr.s6_addr32[3];
-	iinfo->tos = conn->c_tos;
+	if (ic) {
+		iinfo->tos = conn->c_tos;
+		iinfo->sl = ic->i_sl;
+	}
 
 	memset(&iinfo->src_gid, 0, sizeof(iinfo->src_gid));
 	memset(&iinfo->dst_gid, 0, sizeof(iinfo->dst_gid));
 	if (rds_conn_state(conn) == RDS_CONN_UP) {
 		struct rds_ib_device *rds_ibdev;
 
-		ic = conn->c_transport_data;
-
 		rdma_read_gids(ic->i_cm_id, (union ib_gid *)&iinfo->src_gid,
 			       (union ib_gid *)&iinfo->dst_gid);
 
@@ -329,7 +330,7 @@ static int rds6_ib_conn_info_visitor(struct rds_connection *conn,
 				     void *buffer)
 {
 	struct rds6_info_rdma_connection *iinfo6 = buffer;
-	struct rds_ib_connection *ic;
+	struct rds_ib_connection *ic = conn->c_transport_data;
 
 	/* We will only ever look at IB transports */
 	if (conn->c_trans != &rds_ib_transport)
@@ -337,6 +338,10 @@ static int rds6_ib_conn_info_visitor(struct rds_connection *conn,
 
 	iinfo6->src_addr = conn->c_laddr;
 	iinfo6->dst_addr = conn->c_faddr;
+	if (ic) {
+		iinfo6->tos = conn->c_tos;
+		iinfo6->sl = ic->i_sl;
+	}
 
 	memset(&iinfo6->src_gid, 0, sizeof(iinfo6->src_gid));
 	memset(&iinfo6->dst_gid, 0, sizeof(iinfo6->dst_gid));
@@ -344,7 +349,6 @@ static int rds6_ib_conn_info_visitor(struct rds_connection *conn,
 	if (rds_conn_state(conn) == RDS_CONN_UP) {
 		struct rds_ib_device *rds_ibdev;
 
-		ic = conn->c_transport_data;
 		rdma_read_gids(ic->i_cm_id, (union ib_gid *)&iinfo6->src_gid,
 			       (union ib_gid *)&iinfo6->dst_gid);
 		rds_ibdev = ic->rds_ibdev;
diff --git a/net/rds/ib.h b/net/rds/ib.h
index 303c6ee..f2b558e 100644
--- a/net/rds/ib.h
+++ b/net/rds/ib.h
@@ -220,6 +220,7 @@ struct rds_ib_connection {
 	/* Send/Recv vectors */
 	int			i_scq_vector;
 	int			i_rcq_vector;
+	u8			i_sl;
 };
 
 /* This assumes that atomic_t is at least 32 bits */
diff --git a/net/rds/ib_cm.c b/net/rds/ib_cm.c
index fddaa09..233f136 100644
--- a/net/rds/ib_cm.c
+++ b/net/rds/ib_cm.c
@@ -152,6 +152,9 @@ void rds_ib_cm_connect_complete(struct rds_connection *conn, struct rdma_cm_even
 		  RDS_PROTOCOL_MINOR(conn->c_version),
 		  ic->i_flowctl ? ", flow control" : "");
 
+	/* receive sl from the peer */
+	ic->i_sl = ic->i_cm_id->route.path_rec->sl;
+
 	atomic_set(&ic->i_cq_quiesce, 0);
 
 	/* Init rings and fill recv. this needs to wait until protocol
diff --git a/net/rds/rdma_transport.c b/net/rds/rdma_transport.c
index ff74c4b..28668ad 100644
--- a/net/rds/rdma_transport.c
+++ b/net/rds/rdma_transport.c
@@ -43,6 +43,9 @@
 static struct rdma_cm_id *rds6_rdma_listen_id;
 #endif
 
+/* Per IB specification 7.7.3, service level is a 4-bit field. */
+#define TOS_TO_SL(tos)		((tos) & 0xF)
+
 static int rds_rdma_cm_event_handler_cmn(struct rdma_cm_id *cm_id,
 					 struct rdma_cm_event *event,
 					 bool isv6)
@@ -97,10 +100,13 @@ static int rds_rdma_cm_event_handler_cmn(struct rdma_cm_id *cm_id,
 			struct rds_ib_connection *ibic;
 
 			ibic = conn->c_transport_data;
-			if (ibic && ibic->i_cm_id == cm_id)
+			if (ibic && ibic->i_cm_id == cm_id) {
+				cm_id->route.path_rec[0].sl =
+					TOS_TO_SL(conn->c_tos);
 				ret = trans->cm_initiate_connect(cm_id, isv6);
-			else
+			} else {
 				rds_conn_drop(conn);
+			}
 		}
 		break;
 
-- 
1.7.1


^ permalink raw reply related

* Re: [PATCH net] ipv6/addrconf: allow adding multicast addr if IFA_F_MCAUTOJOIN is set
From: David Miller @ 2019-08-20  0:32 UTC (permalink / raw)
  To: liuhangbin; +Cc: netdev, challa, dsahern, jishi
In-Reply-To: <20190813135232.27146-1-liuhangbin@gmail.com>

From: Hangbin Liu <liuhangbin@gmail.com>
Date: Tue, 13 Aug 2019 21:52:32 +0800

> The ip address autojoin is not working for IPv6 as ipv6_add_addr()
> will return -EADDRNOTAVAIL when adding a multicast address.
> 
> Reported-by: Jianlin Shi <jishi@redhat.com>
> Fixes: 93a714d6b53d ("multicast: Extend ip address command to enable multicast group join/leave on")
> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>

I don't understand how all of this works and why ipv6_add_addr(), which
seems designed explicitly to exclude multicast addresses, should accept
them and what all of the possible fallout might be from such a change.

Your commit message is way too terse and makes it impossible to evaluate
your change.  Really, a change of this nature should have a couple paragraphs
of text explaining the existing situation, what is wrong with it, how you
are fixing it, and why you are fixing it that way.

Thanks.

^ permalink raw reply

* [PATCH v2] net/ncsi: Ensure 32-bit boundary for data cksum
From: Terry S. Duncan @ 2019-08-20  0:24 UTC (permalink / raw)
  To: Samuel Mendoza-Jonas, David S . Miller, netdev, linux-kernel
  Cc: openbmc, William Kennington, Joel Stanley, Terry S. Duncan

The NCSI spec indicates that if the data does not end on a 32 bit
boundary, one to three padding bytes equal to 0x00 shall be present to
align the checksum field to a 32-bit boundary.

Signed-off-by: Terry S. Duncan <terry.s.duncan@linux.intel.com>
---
 net/ncsi/ncsi-cmd.c | 2 +-
 net/ncsi/ncsi-rsp.c | 9 ++++++---
 2 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/net/ncsi/ncsi-cmd.c b/net/ncsi/ncsi-cmd.c
index 5c3fad8cba57..eab4346b0a39 100644
--- a/net/ncsi/ncsi-cmd.c
+++ b/net/ncsi/ncsi-cmd.c
@@ -54,7 +54,7 @@ static void ncsi_cmd_build_header(struct ncsi_pkt_hdr *h,
 	checksum = ncsi_calculate_checksum((unsigned char *)h,
 					   sizeof(*h) + nca->payload);
 	pchecksum = (__be32 *)((void *)h + sizeof(struct ncsi_pkt_hdr) +
-		    nca->payload);
+		    ALIGN(nca->payload, 4));
 	*pchecksum = htonl(checksum);
 }
 
diff --git a/net/ncsi/ncsi-rsp.c b/net/ncsi/ncsi-rsp.c
index 7581bf919885..d876bd55f356 100644
--- a/net/ncsi/ncsi-rsp.c
+++ b/net/ncsi/ncsi-rsp.c
@@ -47,7 +47,8 @@ static int ncsi_validate_rsp_pkt(struct ncsi_request *nr,
 	if (ntohs(h->code) != NCSI_PKT_RSP_C_COMPLETED ||
 	    ntohs(h->reason) != NCSI_PKT_RSP_R_NO_ERROR) {
 		netdev_dbg(nr->ndp->ndev.dev,
-			   "NCSI: non zero response/reason code\n");
+			   "NCSI: non zero response/reason code %04xh, %04xh\n",
+			    ntohs(h->code), ntohs(h->reason));
 		return -EPERM;
 	}
 
@@ -55,7 +56,7 @@ static int ncsi_validate_rsp_pkt(struct ncsi_request *nr,
 	 * sender doesn't support checksum according to NCSI
 	 * specification.
 	 */
-	pchecksum = (__be32 *)((void *)(h + 1) + payload - 4);
+	pchecksum = (__be32 *)((void *)(h + 1) + ALIGN(payload, 4) - 4);
 	if (ntohl(*pchecksum) == 0)
 		return 0;
 
@@ -63,7 +64,9 @@ static int ncsi_validate_rsp_pkt(struct ncsi_request *nr,
 					   sizeof(*h) + payload - 4);
 
 	if (*pchecksum != htonl(checksum)) {
-		netdev_dbg(nr->ndp->ndev.dev, "NCSI: checksum mismatched\n");
+		netdev_dbg(nr->ndp->ndev.dev,
+			   "NCSI: checksum mismatched; recd: %08x calc: %08x\n",
+			   *pchecksum, htonl(checksum));
 		return -EINVAL;
 	}
 
-- 
2.17.1


^ permalink raw reply related

* Re: [PATCH net] ipv6: Fix return value of ipv6_mc_may_pull() for malformed packets
From: David Miller @ 2019-08-20  0:20 UTC (permalink / raw)
  To: sbrivio; +Cc: gnault, haliu, edumazet, linus.luessing, netdev
In-Reply-To: <20190819121252.5fccbef2@redhat.com>

From: Stefano Brivio <sbrivio@redhat.com>
Date: Mon, 19 Aug 2019 12:12:52 +0200

> I don't see this on net.git, but it's in your stable bundle on
> Patchwork. Should I resend? Thanks.

I applied it on my laptop while travelling and never pushed it out so it
just rot there, sorry.

Fixed, should be in 'net' now.

^ permalink raw reply

* [PATCH V40 23/29] bpf: Restrict bpf when kernel lockdown is in confidentiality mode
From: Matthew Garrett @ 2019-08-20  0:17 UTC (permalink / raw)
  To: jmorris
  Cc: linux-security-module, linux-kernel, linux-api, David Howells,
	Alexei Starovoitov, Matthew Garrett, Kees Cook, netdev,
	Chun-Yi Lee, Daniel Borkmann
In-Reply-To: <20190820001805.241928-1-matthewgarrett@google.com>

From: David Howells <dhowells@redhat.com>

bpf_read() and bpf_read_str() could potentially be abused to (eg) allow
private keys in kernel memory to be leaked. Disable them if the kernel
has been locked down in confidentiality mode.

Suggested-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Signed-off-by: Matthew Garrett <mjg59@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
cc: netdev@vger.kernel.org
cc: Chun-Yi Lee <jlee@suse.com>
cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: James Morris <jmorris@namei.org>
---
 include/linux/security.h     |  1 +
 kernel/trace/bpf_trace.c     | 10 ++++++++++
 security/lockdown/lockdown.c |  1 +
 3 files changed, 12 insertions(+)

diff --git a/include/linux/security.h b/include/linux/security.h
index 0b2529dbf0f4..e604f4c67f03 100644
--- a/include/linux/security.h
+++ b/include/linux/security.h
@@ -118,6 +118,7 @@ enum lockdown_reason {
 	LOCKDOWN_INTEGRITY_MAX,
 	LOCKDOWN_KCORE,
 	LOCKDOWN_KPROBES,
+	LOCKDOWN_BPF_READ,
 	LOCKDOWN_CONFIDENTIALITY_MAX,
 };
 
diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index 1c9a4745e596..33a954c367f3 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -139,8 +139,13 @@ BPF_CALL_3(bpf_probe_read, void *, dst, u32, size, const void *, unsafe_ptr)
 {
 	int ret;
 
+	ret = security_locked_down(LOCKDOWN_BPF_READ);
+	if (ret < 0)
+		goto out;
+
 	ret = probe_kernel_read(dst, unsafe_ptr, size);
 	if (unlikely(ret < 0))
+out:
 		memset(dst, 0, size);
 
 	return ret;
@@ -566,6 +571,10 @@ BPF_CALL_3(bpf_probe_read_str, void *, dst, u32, size,
 {
 	int ret;
 
+	ret = security_locked_down(LOCKDOWN_BPF_READ);
+	if (ret < 0)
+		goto out;
+
 	/*
 	 * The strncpy_from_unsafe() call will likely not fill the entire
 	 * buffer, but that's okay in this circumstance as we're probing
@@ -577,6 +586,7 @@ BPF_CALL_3(bpf_probe_read_str, void *, dst, u32, size,
 	 */
 	ret = strncpy_from_unsafe(dst, unsafe_ptr, size);
 	if (unlikely(ret < 0))
+out:
 		memset(dst, 0, size);
 
 	return ret;
diff --git a/security/lockdown/lockdown.c b/security/lockdown/lockdown.c
index 27b2cf51e443..2397772c56bd 100644
--- a/security/lockdown/lockdown.c
+++ b/security/lockdown/lockdown.c
@@ -33,6 +33,7 @@ static char *lockdown_reasons[LOCKDOWN_CONFIDENTIALITY_MAX+1] = {
 	[LOCKDOWN_INTEGRITY_MAX] = "integrity",
 	[LOCKDOWN_KCORE] = "/proc/kcore access",
 	[LOCKDOWN_KPROBES] = "use of kprobes",
+	[LOCKDOWN_BPF_READ] = "use of bpf to read kernel RAM",
 	[LOCKDOWN_CONFIDENTIALITY_MAX] = "confidentiality",
 };
 
-- 
2.23.0.rc1.153.gdeed80330f-goog


^ permalink raw reply related

* Re: [PATCH net-next 2/6] net: bridge: Populate the pvid flag in br_vlan_get_info
From: Nikolay Aleksandrov @ 2019-08-20  0:12 UTC (permalink / raw)
  To: Vladimir Oltean, f.fainelli, vivien.didelot, andrew, idosch,
	roopa, davem
  Cc: netdev
In-Reply-To: <20190820000002.9776-3-olteanv@gmail.com>

On 8/20/19 2:59 AM, Vladimir Oltean wrote:
> Currently this simplified code snippet fails:
> 
> 	br_vlan_get_pvid(netdev, &pvid);
> 	br_vlan_get_info(netdev, pvid, &vinfo);
> 	ASSERT(!(vinfo.flags & BRIDGE_VLAN_INFO_PVID));
> 
> It is intuitive that the pvid of a netdevice should have the
> BRIDGE_VLAN_INFO_PVID flag set.
> 
> However I can't seem to pinpoint a commit where this behavior was
> introduced. It seems like it's been like that since forever.
> 
> At a first glance it would make more sense to just handle the
> BRIDGE_VLAN_INFO_PVID flag in __vlan_add_flags. However, as Nikolay
> explains:
> 
>   There are a few reasons why we don't do it, most importantly because
>   we need to have only one visible pvid at any single time, even if it's
>   stale - it must be just one. Right now that rule will not be violated
>   by this change, but people will try using this flag and could see two
>   pvids simultaneously. You can see that the pvid code is even using
>   memory barriers to propagate the new value faster and everywhere the
>   pvid is read only once.  That is the reason the flag is set
>   dynamically when dumping entries, too.  A second (weaker) argument
>   against would be given the above we don't want another way to do the
>   same thing, specifically if it can provide us with two pvids (e.g. if
>   walking the vlan list) or if it can provide us with a pvid different
>   from the one set in the vg. [Obviously, I'm talking about RCU
>   pvid/vlan use cases similar to the dumps.  The locked cases are fine.
>   I would like to avoid explaining why this shouldn't be relied upon
>   without locking]
> 
> So instead of introducing the above change and making sure of the pvid
> uniqueness under RCU, simply dynamically populate the pvid flag in
> br_vlan_get_info().
> 
> Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
> ---
>  net/bridge/br_vlan.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/net/bridge/br_vlan.c b/net/bridge/br_vlan.c
> index f5b2aeebbfe9..bb98984cd27d 100644
> --- a/net/bridge/br_vlan.c
> +++ b/net/bridge/br_vlan.c
> @@ -1281,6 +1281,8 @@ int br_vlan_get_info(const struct net_device *dev, u16 vid,
>  
>  	p_vinfo->vid = vid;
>  	p_vinfo->flags = v->flags;
> +	if (vid == br_get_pvid(vg))
> +		p_vinfo->flags |= BRIDGE_VLAN_INFO_PVID;
>  	return 0;
>  }
>  EXPORT_SYMBOL_GPL(br_vlan_get_info);
> 

Looks good, thanks!
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>


^ permalink raw reply

* Re: [PATCH v5 12/17] net: sgi: ioc3-eth: use dma-direct for dma allocations
From: Jakub Kicinski @ 2019-08-20  0:07 UTC (permalink / raw)
  To: Thomas Bogendoerfer
  Cc: Jonathan Corbet, Ralf Baechle, Paul Burton, James Hogan,
	Dmitry Torokhov, Lee Jones, David S. Miller, Srinivas Kandagatla,
	Alessandro Zummo, Alexandre Belloni, Greg Kroah-Hartman,
	Jiri Slaby, Evgeniy Polyakov, linux-doc, linux-kernel, linux-mips,
	linux-input, netdev, linux-rtc, linux-serial
In-Reply-To: <20190819163144.3478-13-tbogendoerfer@suse.de>

On Mon, 19 Aug 2019 18:31:35 +0200, Thomas Bogendoerfer wrote:
> @@ -1386,18 +1427,24 @@ static netdev_tx_t ioc3_start_xmit(struct sk_buff *skb, struct net_device *dev)
>  		unsigned long b2 = (data | 0x3fffUL) + 1UL;
>  		unsigned long s1 = b2 - data;
>  		unsigned long s2 = data + len - b2;
> +		dma_addr_t d;
>  
>  		desc->cmd    = cpu_to_be32(len | ETXD_INTWHENDONE |
>  					   ETXD_B1V | ETXD_B2V | w0);
>  		desc->bufcnt = cpu_to_be32((s1 << ETXD_B1CNT_SHIFT) |
>  					   (s2 << ETXD_B2CNT_SHIFT));
> -		desc->p1     = cpu_to_be64(ioc3_map(skb->data, 1));
> -		desc->p2     = cpu_to_be64(ioc3_map((void *)b2, 1));
> +		d = dma_map_single(ip->dma_dev, skb->data, s1, DMA_TO_DEVICE);

You'll need to check the DMA address with dma_mapping_error(dev, addr),
otherwise static checkers will get upset.

> +		desc->p1     = cpu_to_be64(ioc3_map(d, PCI64_ATTR_PREF));
> +		d = dma_map_single(ip->dma_dev, (void *)b2, s1, DMA_TO_DEVICE);
> +		desc->p2     = cpu_to_be64(ioc3_map(d, PCI64_ATTR_PREF));

^ permalink raw reply

* Re: [PATCH v5 11/17] net: sgi: ioc3-eth: no need to stop queue set_multicast_list
From: Jakub Kicinski @ 2019-08-20  0:04 UTC (permalink / raw)
  To: Thomas Bogendoerfer
  Cc: Jonathan Corbet, Ralf Baechle, Paul Burton, James Hogan,
	Dmitry Torokhov, Lee Jones, David S. Miller, Srinivas Kandagatla,
	Alessandro Zummo, Alexandre Belloni, Greg Kroah-Hartman,
	Jiri Slaby, Evgeniy Polyakov, linux-doc, linux-kernel, linux-mips,
	linux-input, netdev, linux-rtc, linux-serial
In-Reply-To: <20190819163144.3478-12-tbogendoerfer@suse.de>

On Mon, 19 Aug 2019 18:31:34 +0200, Thomas Bogendoerfer wrote:
> netif_stop_queue()/netif_wake_qeue() aren't needed for changing
> multicast filters. Use spinlocks instead for proper protection
> of private struct.
> 
> Signed-off-by: Thomas Bogendoerfer <tbogendoerfer@suse.de>
> ---
>  drivers/net/ethernet/sgi/ioc3-eth.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/net/ethernet/sgi/ioc3-eth.c b/drivers/net/ethernet/sgi/ioc3-eth.c
> index d862f28887f9..7f85a3bfef14 100644
> --- a/drivers/net/ethernet/sgi/ioc3-eth.c
> +++ b/drivers/net/ethernet/sgi/ioc3-eth.c
> @@ -1542,8 +1542,7 @@ static void ioc3_set_multicast_list(struct net_device *dev)
>  	struct netdev_hw_addr *ha;
>  	u64 ehar = 0;
>  
> -	netif_stop_queue(dev);				/* Lock out others. */
> -
> +	spin_lock_irq(&ip->ioc3_lock);

What does this lock protect? 🤔 No question that stopping TX queues
makes little sense, but this function is only called from
ndo_set_rx_mode(), so with rtnl_lock held. 

I thought it may protect ip->emcr, but that one is accessed with no
locking from the ioc3_timer() -> ioc3_setup_duplex() path..

>  	if (dev->flags & IFF_PROMISC) {			/* Set promiscuous.  */
>  		ip->emcr |= EMCR_PROMISC;
>  		writel(ip->emcr, &regs->emcr);
> @@ -1572,7 +1571,7 @@ static void ioc3_set_multicast_list(struct net_device *dev)
>  		writel(ip->ehar_l, &regs->ehar_l);
>  	}
>  
> -	netif_wake_queue(dev);			/* Let us get going again. */
> +	spin_unlock_irq(&ip->ioc3_lock);
>  }
>  
>  module_pci_driver(ioc3_driver);


^ permalink raw reply

* [PATCH net-next 6/6] net: dsa: tag_8021q: Restore bridge pvid when enabling vlan_filtering
From: Vladimir Oltean @ 2019-08-20  0:00 UTC (permalink / raw)
  To: f.fainelli, vivien.didelot, andrew, idosch, roopa, nikolay, davem
  Cc: netdev, Vladimir Oltean
In-Reply-To: <20190820000002.9776-1-olteanv@gmail.com>

The bridge core assumes that enabling/disabling vlan_filtering will
translate into the simple toggling of a flag for switchdev drivers.

That is clearly not the case for sja1105, which alters the VLAN table
and the pvids in order to obtain port separation in standalone mode.

So, since the bridge will not call any vlan operation through switchdev
after enabling vlan_filtering, we need to ensure we're in a functional
state ourselves.

Hence read the pvid that the bridge is aware of, and program that into
our ports.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
---
 net/dsa/tag_8021q.c | 32 +++++++++++++++++++++++++++++++-
 1 file changed, 31 insertions(+), 1 deletion(-)

diff --git a/net/dsa/tag_8021q.c b/net/dsa/tag_8021q.c
index 67a1bc635a7b..6423beb1efcd 100644
--- a/net/dsa/tag_8021q.c
+++ b/net/dsa/tag_8021q.c
@@ -93,6 +93,33 @@ int dsa_8021q_rx_source_port(u16 vid)
 }
 EXPORT_SYMBOL_GPL(dsa_8021q_rx_source_port);
 
+static int dsa_port_restore_pvid(struct dsa_switch *ds, int port)
+{
+	struct bridge_vlan_info vinfo;
+	struct net_device *slave;
+	u16 pvid;
+	int err;
+
+	if (!dsa_is_user_port(ds, port))
+		return 0;
+
+	slave = ds->ports[port].slave;
+
+	err = br_vlan_get_pvid(slave, &pvid);
+	if (err < 0) {
+		dev_err(ds->dev, "Couldn't determine bridge PVID\n");
+		return err;
+	}
+
+	err = br_vlan_get_info(slave, pvid, &vinfo);
+	if (err < 0) {
+		dev_err(ds->dev, "Couldn't determine PVID attributes\n");
+		return err;
+	}
+
+	return dsa_port_vid_add(&ds->ports[port], pvid, vinfo.flags);
+}
+
 /* RX VLAN tagging (left) and TX VLAN tagging (right) setup shown for a single
  * front-panel switch port (here swp0).
  *
@@ -223,7 +250,10 @@ int dsa_port_setup_8021q_tagging(struct dsa_switch *ds, int port, bool enabled)
 		return err;
 	}
 
-	return 0;
+	if (!enabled)
+		err = dsa_port_restore_pvid(ds, port);
+
+	return err;
 }
 EXPORT_SYMBOL_GPL(dsa_port_setup_8021q_tagging);
 
-- 
2.17.1


^ permalink raw reply related

* [PATCH net-next 4/6] net: dsa: Don't program the VLAN as pvid on the upstream port
From: Vladimir Oltean @ 2019-08-20  0:00 UTC (permalink / raw)
  To: f.fainelli, vivien.didelot, andrew, idosch, roopa, nikolay, davem
  Cc: netdev, Vladimir Oltean
In-Reply-To: <20190820000002.9776-1-olteanv@gmail.com>

Commit b2f81d304cee ("net: dsa: add CPU and DSA ports as VLAN members")
programs the VLAN from the bridge into the specified port as well as the
upstream port, with the same set of flags.

Consider the typical case of installing pvid 1 on user port 1, pvid 2 on
user port 2, etc. The upstream port would end up having a pvid equal to
the last user port whose pvid was programmed from the bridge. Less than
useful.

So just don't change the pvid of the upstream port and let it be
whatever the driver set it internally to be.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
---
 net/dsa/switch.c | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/net/dsa/switch.c b/net/dsa/switch.c
index 84ab2336131e..02ccc53f1926 100644
--- a/net/dsa/switch.c
+++ b/net/dsa/switch.c
@@ -239,17 +239,21 @@ dsa_switch_vlan_prepare_bitmap(struct dsa_switch *ds,
 			       const struct switchdev_obj_port_vlan *vlan,
 			       const unsigned long *bitmap)
 {
+	struct switchdev_obj_port_vlan v = *vlan;
 	int port, err;
 
 	if (!ds->ops->port_vlan_prepare || !ds->ops->port_vlan_add)
 		return -EOPNOTSUPP;
 
 	for_each_set_bit(port, bitmap, ds->num_ports) {
-		err = dsa_port_vlan_check(ds, port, vlan);
+		if (dsa_is_cpu_port(ds, port) || dsa_is_dsa_port(ds, port))
+			v.flags &= ~BRIDGE_VLAN_INFO_PVID;
+
+		err = dsa_port_vlan_check(ds, port, &v);
 		if (err)
 			return err;
 
-		err = ds->ops->port_vlan_prepare(ds, port, vlan);
+		err = ds->ops->port_vlan_prepare(ds, port, &v);
 		if (err)
 			return err;
 	}
@@ -262,10 +266,14 @@ dsa_switch_vlan_add_bitmap(struct dsa_switch *ds,
 			   const struct switchdev_obj_port_vlan *vlan,
 			   const unsigned long *bitmap)
 {
+	struct switchdev_obj_port_vlan v = *vlan;
 	int port;
 
-	for_each_set_bit(port, bitmap, ds->num_ports)
-		ds->ops->port_vlan_add(ds, port, vlan);
+	for_each_set_bit(port, bitmap, ds->num_ports) {
+		if (dsa_is_cpu_port(ds, port) || dsa_is_dsa_port(ds, port))
+			v.flags &= ~BRIDGE_VLAN_INFO_PVID;
+		ds->ops->port_vlan_add(ds, port, &v);
+	}
 }
 
 static int dsa_switch_vlan_add(struct dsa_switch *ds,
-- 
2.17.1


^ permalink raw reply related

* [PATCH net-next 5/6] net: dsa: Allow proper internal use of VLANs
From: Vladimir Oltean @ 2019-08-20  0:00 UTC (permalink / raw)
  To: f.fainelli, vivien.didelot, andrew, idosch, roopa, nikolay, davem
  Cc: netdev, Vladimir Oltean
In-Reply-To: <20190820000002.9776-1-olteanv@gmail.com>

Below commit:

commit 2ea7a679ca2abd251c1ec03f20508619707e1749
Author: Andrew Lunn <andrew@lunn.ch>
Date:   Tue Nov 7 00:04:24 2017 +0100

    net: dsa: Don't add vlans when vlan filtering is disabled

    The software bridge can be build with vlan filtering support
    included. However, by default it is turned off. In its turned off
    state, it still passes VLANs via switchev, even though they are not to
    be used. Don't pass these VLANs to the hardware. Only do so when vlan
    filtering is enabled.

    This fixes at least one corner case. There are still issues in other
    corners, such as when vlan_filtering is later enabled.

    Signed-off-by: Andrew Lunn <andrew@lunn.ch>
    Signed-off-by: David S. Miller <davem@davemloft.net>

stubs out SWITCHDEV_OBJ_ID_PORT_VLAN objects notified by the bridge core
to DSA drivers when vlan_filtering is 0.

This is generally a good thing, because it allows dsa_8021q to make
private use of VLANs in that mode.

So it makes sense to move the check for the bridge presence and
vlan_filtering setting one layer above. We don't want calls from
dsa_8021q to be prevented by this, only from the bridge core.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
---
 net/dsa/port.c  | 10 ++--------
 net/dsa/slave.c |  8 ++++++++
 2 files changed, 10 insertions(+), 8 deletions(-)

diff --git a/net/dsa/port.c b/net/dsa/port.c
index f75301456430..c1cc41c1eeb6 100644
--- a/net/dsa/port.c
+++ b/net/dsa/port.c
@@ -348,10 +348,7 @@ int dsa_port_vlan_add(struct dsa_port *dp,
 		.vlan = vlan,
 	};
 
-	if (!dp->bridge_dev || br_vlan_enabled(dp->bridge_dev))
-		return dsa_port_notify(dp, DSA_NOTIFIER_VLAN_ADD, &info);
-
-	return 0;
+	return dsa_port_notify(dp, DSA_NOTIFIER_VLAN_ADD, &info);
 }
 
 int dsa_port_vlan_del(struct dsa_port *dp,
@@ -363,10 +360,7 @@ int dsa_port_vlan_del(struct dsa_port *dp,
 		.vlan = vlan,
 	};
 
-	if (!dp->bridge_dev || br_vlan_enabled(dp->bridge_dev))
-		return dsa_port_notify(dp, DSA_NOTIFIER_VLAN_DEL, &info);
-
-	return 0;
+	return dsa_port_notify(dp, DSA_NOTIFIER_VLAN_DEL, &info);
 }
 
 int dsa_port_vid_add(struct dsa_port *dp, u16 vid, u16 flags)
diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index 33f41178afcc..6a02eb8988d1 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -341,6 +341,8 @@ static int dsa_slave_port_obj_add(struct net_device *dev,
 	case SWITCHDEV_OBJ_ID_PORT_VLAN:
 		if (obj->orig_dev != dev)
 			return -EOPNOTSUPP;
+		if (dp->bridge_dev && !br_vlan_enabled(dp->bridge_dev))
+			return 0;
 		err = dsa_port_vlan_add(dp, SWITCHDEV_OBJ_PORT_VLAN(obj),
 					trans);
 		break;
@@ -373,6 +375,8 @@ static int dsa_slave_port_obj_del(struct net_device *dev,
 	case SWITCHDEV_OBJ_ID_PORT_VLAN:
 		if (obj->orig_dev != dev)
 			return -EOPNOTSUPP;
+		if (dp->bridge_dev && !br_vlan_enabled(dp->bridge_dev))
+			return 0;
 		err = dsa_port_vlan_del(dp, SWITCHDEV_OBJ_PORT_VLAN(obj));
 		break;
 	default:
@@ -1073,6 +1077,8 @@ static int dsa_slave_vlan_rx_add_vid(struct net_device *dev, __be16 proto,
 	 * need to emulate the switchdev prepare + commit phase.
 	 */
 	if (dp->bridge_dev) {
+		if (!br_vlan_enabled(dp->bridge_dev))
+			return 0;
 		/* br_vlan_get_info() returns -EINVAL or -ENOENT if the
 		 * device, respectively the VID is not found, returning
 		 * 0 means success, which is a failure for us here.
@@ -1097,6 +1103,8 @@ static int dsa_slave_vlan_rx_kill_vid(struct net_device *dev, __be16 proto,
 	 * need to emulate the switchdev prepare + commit phase.
 	 */
 	if (dp->bridge_dev) {
+		if (!br_vlan_enabled(dp->bridge_dev))
+			return 0;
 		/* br_vlan_get_info() returns -EINVAL or -ENOENT if the
 		 * device, respectively the VID is not found, returning
 		 * 0 means success, which is a failure for us here.
-- 
2.17.1


^ permalink raw reply related

* [PATCH net-next 3/6] net: dsa: Delete the VID from the upstream port as well
From: Vladimir Oltean @ 2019-08-19 23:59 UTC (permalink / raw)
  To: f.fainelli, vivien.didelot, andrew, idosch, roopa, nikolay, davem
  Cc: netdev, Vladimir Oltean
In-Reply-To: <20190820000002.9776-1-olteanv@gmail.com>

Commit b2f81d304cee ("net: dsa: add CPU and DSA ports as VLAN members")
is littering a lot. After deleting a VLAN added on a DSA port, it still
remains installed in the hardware filter of the upstream port. Fix this.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
---
 net/dsa/switch.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/net/dsa/switch.c b/net/dsa/switch.c
index 09d9286b27cc..84ab2336131e 100644
--- a/net/dsa/switch.c
+++ b/net/dsa/switch.c
@@ -295,11 +295,20 @@ static int dsa_switch_vlan_del(struct dsa_switch *ds,
 			       struct dsa_notifier_vlan_info *info)
 {
 	const struct switchdev_obj_port_vlan *vlan = info->vlan;
+	int port;
 
 	if (!ds->ops->port_vlan_del)
 		return -EOPNOTSUPP;
 
+	/* Build a mask of VLAN members */
+	bitmap_zero(ds->bitmap, ds->num_ports);
 	if (ds->index == info->sw_index)
+		set_bit(info->port, ds->bitmap);
+	for (port = 0; port < ds->num_ports; port++)
+		if (dsa_is_cpu_port(ds, port) || dsa_is_dsa_port(ds, port))
+			set_bit(port, ds->bitmap);
+
+	for_each_set_bit(port, ds->bitmap, ds->num_ports)
 		return ds->ops->port_vlan_del(ds, info->port, vlan);
 
 	return 0;
-- 
2.17.1


^ permalink raw reply related

* [PATCH net-next 1/6] net: dsa: tag_8021q: Future-proof the reserved fields in the custom VID
From: Vladimir Oltean @ 2019-08-19 23:59 UTC (permalink / raw)
  To: f.fainelli, vivien.didelot, andrew, idosch, roopa, nikolay, davem
  Cc: netdev, Vladimir Oltean
In-Reply-To: <20190820000002.9776-1-olteanv@gmail.com>

After witnessing the discussion in https://lkml.org/lkml/2019/8/14/151
w.r.t. ioctl extensibility, it became clear that such an issue might
prevent that the 3 RSV bits inside the DSA 802.1Q tag might also suffer
the same fate and be useless for further extension.

So clearly specify that the reserved bits should currently be
transmitted as zero and ignored on receive. The DSA tagger already does
this (and has always did), and is the only known user so far (no
Wireshark dissection plugin, etc). So there should be no incompatibility
to speak of.

Fixes: 0471dd429cea ("net: dsa: tag_8021q: Create a stable binary format")
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
---
 net/dsa/tag_8021q.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/net/dsa/tag_8021q.c b/net/dsa/tag_8021q.c
index 6ebbd799c4eb..67a1bc635a7b 100644
--- a/net/dsa/tag_8021q.c
+++ b/net/dsa/tag_8021q.c
@@ -28,6 +28,7 @@
  *
  * RSV - VID[9]:
  *	To be used for further expansion of SWITCH_ID or for other purposes.
+ *	Must be transmitted as zero and ignored on receive.
  *
  * SWITCH_ID - VID[8:6]:
  *	Index of switch within DSA tree. Must be between 0 and
@@ -35,6 +36,7 @@
  *
  * RSV - VID[5:4]:
  *	To be used for further expansion of PORT or for other purposes.
+ *	Must be transmitted as zero and ignored on receive.
  *
  * PORT - VID[3:0]:
  *	Index of switch port. Must be between 0 and DSA_MAX_PORTS - 1.
-- 
2.17.1

^ permalink raw reply related

* [PATCH net-next 2/6] net: bridge: Populate the pvid flag in br_vlan_get_info
From: Vladimir Oltean @ 2019-08-19 23:59 UTC (permalink / raw)
  To: f.fainelli, vivien.didelot, andrew, idosch, roopa, nikolay, davem
  Cc: netdev, Vladimir Oltean
In-Reply-To: <20190820000002.9776-1-olteanv@gmail.com>

Currently this simplified code snippet fails:

	br_vlan_get_pvid(netdev, &pvid);
	br_vlan_get_info(netdev, pvid, &vinfo);
	ASSERT(!(vinfo.flags & BRIDGE_VLAN_INFO_PVID));

It is intuitive that the pvid of a netdevice should have the
BRIDGE_VLAN_INFO_PVID flag set.

However I can't seem to pinpoint a commit where this behavior was
introduced. It seems like it's been like that since forever.

At a first glance it would make more sense to just handle the
BRIDGE_VLAN_INFO_PVID flag in __vlan_add_flags. However, as Nikolay
explains:

  There are a few reasons why we don't do it, most importantly because
  we need to have only one visible pvid at any single time, even if it's
  stale - it must be just one. Right now that rule will not be violated
  by this change, but people will try using this flag and could see two
  pvids simultaneously. You can see that the pvid code is even using
  memory barriers to propagate the new value faster and everywhere the
  pvid is read only once.  That is the reason the flag is set
  dynamically when dumping entries, too.  A second (weaker) argument
  against would be given the above we don't want another way to do the
  same thing, specifically if it can provide us with two pvids (e.g. if
  walking the vlan list) or if it can provide us with a pvid different
  from the one set in the vg. [Obviously, I'm talking about RCU
  pvid/vlan use cases similar to the dumps.  The locked cases are fine.
  I would like to avoid explaining why this shouldn't be relied upon
  without locking]

So instead of introducing the above change and making sure of the pvid
uniqueness under RCU, simply dynamically populate the pvid flag in
br_vlan_get_info().

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
---
 net/bridge/br_vlan.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/net/bridge/br_vlan.c b/net/bridge/br_vlan.c
index f5b2aeebbfe9..bb98984cd27d 100644
--- a/net/bridge/br_vlan.c
+++ b/net/bridge/br_vlan.c
@@ -1281,6 +1281,8 @@ int br_vlan_get_info(const struct net_device *dev, u16 vid,

 	p_vinfo->vid = vid;
 	p_vinfo->flags = v->flags;
+	if (vid == br_get_pvid(vg))
+		p_vinfo->flags |= BRIDGE_VLAN_INFO_PVID;
 	return 0;
 }
 EXPORT_SYMBOL_GPL(br_vlan_get_info);
-- 
2.17.1

^ permalink raw reply related

* [PATCH net-next 0/6] Dynamic toggling of vlan_filtering for SJA1105 DSA
From: Vladimir Oltean @ 2019-08-19 23:59 UTC (permalink / raw)
  To: f.fainelli, vivien.didelot, andrew, idosch, roopa, nikolay, davem
  Cc: netdev, Vladimir Oltean

This patchset addresses a few limitations in DSA and the bridge core
that made it impossible for this sequence of commands to work:

  ip link add name br0 type bridge
  ip link set dev swp2 master br0
  echo 1 > /sys/class/net/br0/bridge/vlan_filtering

Only this sequence was previously working:

  ip link add name br0 type bridge vlan_filtering 1
  ip link set dev swp2 master br0

On SJA1105, the situation is further complicated by the fact that
toggling vlan_filtering is causing a switch reset. However, the hardware
state restoration logic is already there in the driver. It is a matter
of the layers above which need a few fixups.

Also see this discussion thread:
https://www.spinics.net/lists/netdev/msg581042.html

Patch 1/6 is not functionally related but also related to dsa_8021q
handling of VLANs and this is a good opportunity to bring up the subject
for discussion.

Vladimir Oltean (6):
  net: dsa: tag_8021q: Future-proof the reserved fields in the custom
    VID
  net: bridge: Populate the pvid flag in br_vlan_get_info
  net: dsa: Delete the VID from the upstream port as well
  net: dsa: Don't program the VLAN as pvid on the upstream port
  net: dsa: Allow proper internal use of VLANs
  net: dsa: tag_8021q: Restore bridge pvid when enabling vlan_filtering

 net/bridge/br_vlan.c |  2 ++
 net/dsa/port.c       | 10 ++--------
 net/dsa/slave.c      |  8 ++++++++
 net/dsa/switch.c     | 25 +++++++++++++++++++++----
 net/dsa/tag_8021q.c  | 34 +++++++++++++++++++++++++++++++++-
 5 files changed, 66 insertions(+), 13 deletions(-)

-- 
2.17.1

^ permalink raw reply

* Re: [net-next v2 04/14] ice: fix set pause param autoneg check
From: David Miller @ 2019-08-19 23:59 UTC (permalink / raw)
  To: jakub.kicinski
  Cc: jeffrey.t.kirsher, paul.greenwalt, netdev, nhorman, sassmann,
	andrewx.bowers
In-Reply-To: <20190819161142.6f4cc14d@cakuba.netronome.com>

From: Jakub Kicinski <jakub.kicinski@netronome.com>
Date: Mon, 19 Aug 2019 16:11:42 -0700

> On Mon, 19 Aug 2019 09:16:58 -0700, Jeff Kirsher wrote:
>> +	pcaps = devm_kzalloc(&vsi->back->pdev->dev, sizeof(*pcaps),
>> +			     GFP_KERNEL);
>> +	if (!pcaps)
>> +		return -ENOMEM;
>> +
>> +	/* Get current PHY config */
>> +	status = ice_aq_get_phy_caps(pi, false, ICE_AQC_REPORT_SW_CFG, pcaps,
>> +				     NULL);
>> +	if (status) {
>> +		devm_kfree(&vsi->back->pdev->dev, pcaps);
>> +		return -EIO;
>> +	}
>> +
>> +	is_an = ((pcaps->caps & ICE_AQC_PHY_AN_MODE) ?
>> +			AUTONEG_ENABLE : AUTONEG_DISABLE);
>> +
>> +	devm_kfree(&vsi->back->pdev->dev, pcaps);
> 
> Is it just me or is this use of devm_k*alloc absolutely pointless?

Yeah it looks like an overzealous use of these interfaces to me too.

^ permalink raw reply

* Re: [PATCH v5 10/17] net: sgi: ioc3-eth: rework skb rx handling
From: Jakub Kicinski @ 2019-08-19 23:55 UTC (permalink / raw)
  To: Thomas Bogendoerfer
  Cc: Jonathan Corbet, Ralf Baechle, Paul Burton, James Hogan,
	Dmitry Torokhov, Lee Jones, David S. Miller, Srinivas Kandagatla,
	Alessandro Zummo, Alexandre Belloni, Greg Kroah-Hartman,
	Jiri Slaby, Evgeniy Polyakov, linux-doc, linux-kernel, linux-mips,
	linux-input, netdev, linux-rtc, linux-serial
In-Reply-To: <20190819163144.3478-11-tbogendoerfer@suse.de>

On Mon, 19 Aug 2019 18:31:33 +0200, Thomas Bogendoerfer wrote:
> Buffers alloacted by alloc_skb() are already cache aligned so there
> is no need for an extra align done by ioc3_alloc_skb. And instead
> of skb_put/skb_trim simply use one skb_put after frame size is known
> during receive.
> 
> Signed-off-by: Thomas Bogendoerfer <tbogendoerfer@suse.de>
> ---
>  drivers/net/ethernet/sgi/ioc3-eth.c | 50 ++++++++-----------------------------
>  1 file changed, 11 insertions(+), 39 deletions(-)
> 
> diff --git a/drivers/net/ethernet/sgi/ioc3-eth.c b/drivers/net/ethernet/sgi/ioc3-eth.c
> index c875640926d6..d862f28887f9 100644
> --- a/drivers/net/ethernet/sgi/ioc3-eth.c
> +++ b/drivers/net/ethernet/sgi/ioc3-eth.c
> @@ -11,7 +11,6 @@
>   *
>   * To do:
>   *
> - *  o Handle allocation failures in ioc3_alloc_skb() more gracefully.
>   *  o Handle allocation failures in ioc3_init_rings().
>   *  o Use prefetching for large packets.  What is a good lower limit for
>   *    prefetching?
> @@ -72,6 +71,12 @@
>  #define TX_RING_ENTRIES		128
>  #define TX_RING_MASK		(TX_RING_ENTRIES - 1)
>  
> +/* BEWARE: The IOC3 documentation documents the size of rx buffers as
> + * 1644 while it's actually 1664.  This one was nasty to track down...
> + */
> +#define RX_OFFSET		10
> +#define RX_BUF_SIZE		1664
> +
>  #define ETCSR_FD   ((17 << ETCSR_IPGR2_SHIFT) | (11 << ETCSR_IPGR1_SHIFT) | 21)
>  #define ETCSR_HD   ((21 << ETCSR_IPGR2_SHIFT) | (21 << ETCSR_IPGR1_SHIFT) | 21)
>  
> @@ -111,31 +116,6 @@ static void ioc3_init(struct net_device *dev);
>  static const char ioc3_str[] = "IOC3 Ethernet";
>  static const struct ethtool_ops ioc3_ethtool_ops;
>  
> -/* We use this to acquire receive skb's that we can DMA directly into. */
> -
> -#define IOC3_CACHELINE	128UL

Is the cache line on the platform this driver works on 128B?
This looks like a DMA engine alignment requirement, more than an
optimization.

The comment in __alloc_skb() says:

	/* We do our best to align skb_shared_info on a separate cache
	 * line. It usually works because kmalloc(X > SMP_CACHE_BYTES) gives
	 * aligned memory blocks, unless SLUB/SLAB debug is enabled.
	 * Both skb->head and skb_shared_info are cache line aligned.
	 */

note the "unless".

> -static inline unsigned long aligned_rx_skb_addr(unsigned long addr)
> -{
> -	return (~addr + 1) & (IOC3_CACHELINE - 1UL);
> -}
> -
> -static inline struct sk_buff *ioc3_alloc_skb(unsigned long length,
> -					     unsigned int gfp_mask)
> -{
> -	struct sk_buff *skb;
> -
> -	skb = alloc_skb(length + IOC3_CACHELINE - 1, gfp_mask);
> -	if (likely(skb)) {
> -		int offset = aligned_rx_skb_addr((unsigned long)skb->data);
> -
> -		if (offset)
> -			skb_reserve(skb, offset);
> -	}
> -
> -	return skb;
> -}
> -
>  static inline unsigned long ioc3_map(void *ptr, unsigned long vdev)
>  {
>  #ifdef CONFIG_SGI_IP27
> @@ -148,12 +128,6 @@ static inline unsigned long ioc3_map(void *ptr, unsigned long vdev)
>  #endif
>  }
>  
> -/* BEWARE: The IOC3 documentation documents the size of rx buffers as
> - * 1644 while it's actually 1664.  This one was nasty to track down ...
> - */
> -#define RX_OFFSET		10
> -#define RX_BUF_ALLOC_SIZE	(1664 + RX_OFFSET + IOC3_CACHELINE)
> -
>  #define IOC3_SIZE 0x100000
>  
>  static inline u32 mcr_pack(u32 pulse, u32 sample)
> @@ -534,10 +508,10 @@ static inline void ioc3_rx(struct net_device *dev)
>  		err = be32_to_cpu(rxb->err);		/* It's valid ...  */
>  		if (err & ERXBUF_GOODPKT) {
>  			len = ((w0 >> ERXBUF_BYTECNT_SHIFT) & 0x7ff) - 4;
> -			skb_trim(skb, len);
> +			skb_put(skb, len);
>  			skb->protocol = eth_type_trans(skb, dev);
>  
> -			new_skb = ioc3_alloc_skb(RX_BUF_ALLOC_SIZE, GFP_ATOMIC);
> +			new_skb = alloc_skb(RX_BUF_SIZE, GFP_ATOMIC);
>  			if (!new_skb) {
>  				/* Ouch, drop packet and just recycle packet
>  				 * to keep the ring filled.
> @@ -546,6 +520,7 @@ static inline void ioc3_rx(struct net_device *dev)
>  				new_skb = skb;
>  				goto next;
>  			}
> +			new_skb->dev = dev;

Assigning dev pointer seems unrelated to the rest of the patch?

>  			if (likely(dev->features & NETIF_F_RXCSUM))
>  				ioc3_tcpudp_checksum(skb,
> @@ -556,8 +531,6 @@ static inline void ioc3_rx(struct net_device *dev)
>  
>  			ip->rx_skbs[rx_entry] = NULL;	/* Poison  */
>  
> -			/* Because we reserve afterwards. */
> -			skb_put(new_skb, (1664 + RX_OFFSET));
>  			rxb = (struct ioc3_erxbuf *)new_skb->data;
>  			skb_reserve(new_skb, RX_OFFSET);
>  
> @@ -846,16 +819,15 @@ static void ioc3_alloc_rings(struct net_device *dev)
>  		for (i = 0; i < RX_BUFFS; i++) {
>  			struct sk_buff *skb;
>  
> -			skb = ioc3_alloc_skb(RX_BUF_ALLOC_SIZE, GFP_ATOMIC);
> +			skb = alloc_skb(RX_BUF_SIZE, GFP_ATOMIC);
>  			if (!skb) {
>  				show_free_areas(0, NULL);
>  				continue;
>  			}
> +			skb->dev = dev;
>  
>  			ip->rx_skbs[i] = skb;
>  
> -			/* Because we reserve afterwards. */
> -			skb_put(skb, (1664 + RX_OFFSET));
>  			rxb = (struct ioc3_erxbuf *)skb->data;
>  			rxr[i] = cpu_to_be64(ioc3_map(rxb, 1));
>  			skb_reserve(skb, RX_OFFSET);


^ permalink raw reply

* Re: [PATCH v5 09/17] net: sgi: ioc3-eth: use defines for constants dealing with desc rings
From: Jakub Kicinski @ 2019-08-19 23:53 UTC (permalink / raw)
  To: Thomas Bogendoerfer
  Cc: Jonathan Corbet, Ralf Baechle, Paul Burton, James Hogan,
	Dmitry Torokhov, Lee Jones, David S. Miller, Srinivas Kandagatla,
	Alessandro Zummo, Alexandre Belloni, Greg Kroah-Hartman,
	Jiri Slaby, Evgeniy Polyakov, linux-doc, linux-kernel, linux-mips,
	linux-input, netdev, linux-rtc, linux-serial
In-Reply-To: <20190819163144.3478-10-tbogendoerfer@suse.de>

On Mon, 19 Aug 2019 18:31:32 +0200, Thomas Bogendoerfer wrote:
> Descriptor ring sizes of the IOC3 are more or less fixed size. To
> make clearer where there is a relation to ring sizes use defines.
> 
> Signed-off-by: Thomas Bogendoerfer <tbogendoerfer@suse.de>

Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>

^ permalink raw reply

* Re: [PATCH v5 00/17] Use MFD framework for SGI IOC3 drivers
From: Jakub Kicinski @ 2019-08-19 23:51 UTC (permalink / raw)
  To: Thomas Bogendoerfer
  Cc: Jonathan Corbet, Ralf Baechle, Paul Burton, James Hogan,
	Dmitry Torokhov, Lee Jones, David S. Miller, Srinivas Kandagatla,
	Alessandro Zummo, Alexandre Belloni, Greg Kroah-Hartman,
	Jiri Slaby, Evgeniy Polyakov, linux-doc, linux-kernel, linux-mips,
	linux-input, netdev, linux-rtc, linux-serial
In-Reply-To: <20190819163144.3478-1-tbogendoerfer@suse.de>

On Mon, 19 Aug 2019 18:31:23 +0200, Thomas Bogendoerfer wrote:
>  - requested by Jakub I've splitted ioc3 ethernet driver changes into
>    more steps to make the transition more visible; 

Thanks a lot for doing that!

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox